Agenda

Communication

Join us on Slack


Rooms

All activities will take place in one of three rooms:

  • (Rm. 1/2) 3A001 & 3A002 learning studios (two rooms joined into one large room)
  • (Rm. 1) 3A001 learning studio
  • (Rm. 2) 3A002 learning studio
  • (Rm. 3) 3A003A/B used for workshops only (please be quiet when crossing skywalk, due to event in atrium, thanks!)

Alternative to workshops: if you're not attending workshops, you're welcome to use the hack room in 3A001.

Childcare: 1S018 & 1S019, 8:45am-5:45pm

Lactation Room: 1S011 (ask Jessica Minnier or Lilly Winfree for access, or another organizer/volunteer who can find them; it is a designated private room with a fridge and sink)

Schedule

(times subject to change)

  • 8:00 - 9:00: (Rm. 1/2) Registration + coffee/tea/pastries + cRaggy submissions open (link)
  • 8:45 Childcare Opens
  • 9:00 - 9:10: (Rm. 1/2) Introduction (slides) (cRaggy slides)
  • 9:10 - 9:45: (Rm. 1/2) Keynote: Alison Hill (link)
  • 9:45 cRaggy submissions due
  • 9:45 - 10:30: (Rm. 1/2) Coffee break + Panel Discussion
  • 10:30 - 12:00: (Rm. 2/3) Workshops (link)
    • (Rm. 2) A gRadual intRoduction to Data Wrangling
    • (Rm. 3) Introduction to Deep Learning using TensorFlow with R (Intermediate Machine Learning)
  • 12:00 - 1:15: (Rm. 1) Lunch (catered) + cRaggy Browsing/Voting
  • 1:15 - 1:50: (Rm. 1/2) Keynote: Kara Woo (link)
  • 1:50 - 2:15: (Rm. 1) Break/Socializing
  • 2:15 cRaggy votes due
  • 2:15 - 3:45: (Rm. 2/3) Workshops (link)
    • (Rm. 2) A gRadual intRoduction to Data Visualization
    • (Rm. 3) Using R with Databases
  • 3:45 - 4:00: (Rm. 1/2) Break (snacks/beverages)
  • 4:00 - 6:00: (Rm. 1/2) Lightning talks (link)
    • cRaggy Winner Presentations
    • Lightning Talk Session 1 ~ 50 minutes
    • 15 minute break
    • Lightning Talk Session 2 ~ 45 minutes
  • 5:45: Childcare closes
  • 6:00: (Rm. 1/2) Closing remarks
  • 6:00: (Rm. 1/2) Social
  • (Portland) Birds of a Feather Dinners (link)


Keynotes

Alison Hill

Title: Big Magic with R: Creative Learning Beyond Fear

Bio: Alison is an Associate Professor of Pediatrics at Oregon Health & Science University (OHSU) and the Assistant Director of OHSU's Center for Spoken Language Understanding, home to the Computer Science graduate education program. Her current research aims to evaluate whether Natural Language Processing methods can be translated into meaningful outcome measures for individuals with neurodevelopmental disorders like Autism, Down Syndrome, and Fragile X Syndrome. Her work has been published in numerous peer-reviewed journals and book chapters, and has been funded by the Oregon Clinical and Translational Research Institute, the National Institutes of Health Office of Research on Women's Health, and the National Institute on Deafness & Other Communication Disorders. Alison began using and teaching R seven years ago. She teaches four graduate-level courses using R, and is the author of "Working with Data in the Tidyverse" to be offered by DataCamp.com. She is also a co-author of the book "blogdown: Creating Websites with R Markdown" with Yihui Xie and Amber Thomas.

Abstract: Inspired by the book "Big Magic: Creative Living Beyond Fear" by Elizabeth Gilbert, Alison will talk about the five essential ingredients needed to creatively learn R and why these elements are also essential for advanced users to take their R skills to the next level. You will hear practical advice for when, where, and how to start a project in R, and how your learning can add value, both to your own knowledge and to the larger community of R learners. Along the way, she will share recommended resources and evidence-based strategies for project-based learning. Alison's background working with both new and advanced R users gives her a unique perspective on this topic.

Kara Woo

Title: Anyone Can Play Git/R: Tips for First-Time Contributions to R Packages

Bio: Kara Woo is a research scientist in data curation at Sage Bionetworks. She has a master's in library and information science from the University of Washington and is interested in data management, data visualization, and open and reproducible research. Kara is also a co-maintainer of accidental aRt, a blog of data visualizations gone beautifully wrong.

Abstract: Contributing to R packages and projects can be a rewarding way to give back to the tools you use and to improve your own programming skills in the process. In this talk, Kara will discuss some of the varied ways to contribute to existing projects. Anyone, whether a seasoned programmer or someone brand new to R, can make useful contributions to R packages. Kara will draw on her experience working on ggplot2 to offer strategies for finding your way in an unfamiliar codebase, and to give insights into the relationship between maintainers and contributors.



Workshops

A gRadual intRoduction to Data Wrangling (Beginner)

10:30-12:00; Rm. 2: 3A002

Ted Laderas and Chester Ismay

Data comes in a variety of formats, and even working with spreadsheet data can pose tricky problems. In this workshop, you'll use the `tidyverse` to transform messy data into nice summary tables and get a handle on dates, times, and strings as well. You'll find that the `tidyverse` makes it easier to build and share reproducible data wrangling workflows that will increase your overall productivity. You might even have some fun doing it! The workshop materials are available as a webpage here and on GitHub here.

A gRadual intRoduction to Data Visualization (Beginner)

2:15-3:45; Rm. 2: 3A002

Ted Laderas and Chester Ismay

An exploration of the `ggplot2` package. The focus will be on creating and tweaking basic plots while reviewing the underlying Grammar of Graphics throughout. You'll also see an easy way to turn many `ggplot2` graphics into dynamic plots with a simple function call. The workshop materials are available as a webpage here and on GitHub here.

Introduction to Deep Learning using TensorFlow with R (Intermediate Machine Learning)

10:30-12:00; Rm. 3: 3A003A/B

Paige Bailey

TensorFlow is an open-source software library for numerical computing (similar to Python’s numpy) that has made waves recently for its impact in deep learning. The R interface to TensorFlow allows you to work productively using the high-level Keras and Estimator APIs – and, when you need more control, provides full access to the core TensorFlow API. Once the models have been created, you can visualize them both in RStudio and using TensorBoard.

Agenda:

  • What is deep learning?
  • What are TensorFlow and Keras?
  • How do you build a neural network?
  • How do you visualize what you have just created?

Using R with Databases (Intermediate)

2:15-3:45; Rm. 3: 3A003A/B

Aaron Makubuya

Would you like to learn how R is used to glean valuable information from data stored in relational databases? With the proliferation of data and increased reliance on data-driven knowledge, many companies need skilled individuals who can extract meaningful information from data in order to drive informed decision making. This intermediate workshop will introduce you to various methods of using R with databases to derive valuable information from data. The workshop will give you hands-on experience in the following areas:

  • how to connect to a database from R
  • how to create and manage database objects
  • how to populate tables in a database and issue SQL queries to retrieve and transform your data using R



Lightning talks

Speaker | Affiliation | Talk Title
Paige Bailey | Microsoft | Illuminating TensorBoard outputs – how to structure your neural network model topologies to take advantage of TensorFlow’s graph visualizations
cRaggy speaker 1 | TBD
cRaggy speaker 2 | TBD
Caitlin Hudon | R-Ladies; Shelfbucks | NULL: When Missing Data is Valuable
Jay Lee | Reed College | Ain't No Party Like a Political Party
Jonathan Nolis | Nolis, LLC | Using deep learning in R to generate offensive license plates
Zachary Foster | Oregon State University | Taxa: An R package for parsing and manipulation of taxonomic data
Daniel Anderson | University of Oregon | Contribute to Open Source with Pretty Slides
Frank Farach | Slalom Consulting | Continuous Data Quality Improvement with R
Sondra Stegenga | University of Oregon | Data Ethics: Considerations for Special Populations
David Severski | Starbucks | Enterprise Technology Risk Management - With R!
Eric Leung | OHSU DMICE | Underutilized functions for data exploration: tips from exploring hundreds of variables
Dror Berel | Fred Hutch | A Systematic approach to retrieve data from S4 object-oriented system using a tidy solution
Norma Padron | American Hospital Association | Exciting applications and uses of R in health care

Speakers

Speaker | Talk Description
Caitlin Hudon | Messy data is an unfortunate fact of life, but even missing data can still be useful. This lightning talk will cover how to extract value from missing data, how to diagnose your missing data, and strategies for handling missing data.
Jay Lee | How do national-level politicians tweet? Using some statistical learning methods (principal component analysis, clustering, generalized regression, tree-based methods), we can parse out the ways in which these politicians' tweets are most similar to and different from each other, and test some models to predict their political party. Analysis of these methods reveals some unexpected and insightful words that distinguish the party of a tweeter.
Jonathan Nolis | Recurrent Neural Networks are great tools for generating sequences of text, and are easy to use with R and the Keras package. I trained an RNN using a list of rejected license plate requests from the state of Arizona, with hilarious results. In this talk I will go through my process and some of the output.
Zachary Foster | Taxonomic data is commonplace in high-throughput sequencing research, digitized museum/herbarium records, and species-occurrence databases, but its hierarchical nature makes it hard to use. The R package taxa provides classes for the storage and manipulation of taxonomic data. Classes range from building blocks to project-level classes storing user-defined data mapped to a taxonomy. It includes flexible parsers and functions modelled after dplyr for manipulating a taxonomy and associated data.
Daniel Anderson | The xaringan package makes it easy to produce beautiful slide decks directly from R Markdown by leveraging the remark.js library, while custom CSS scripts can be used to extend and style slides. In this talk, I'll discuss my experience contributing the University of Oregon theme, and how others can modify, develop, and/or contribute their own themes. The talk will have a specific focus on beginners, discuss "where to begin", and the details of contributing a theme to the package.
Frank Farach | Data quality issues can threaten the validity of data products and the decisions they support, becoming critical points of failure. Early detection of potential issues can help reduce negative impacts and provide important context. In this talk, Frank will demonstrate a lightweight, R-based workflow for specifying, running, and logging the results of data quality checks that you can easily incorporate into the pre-processing stages of your data analysis and machine learning pipelines.
Sondra Stegenga | Big data has quickly become a hot topic in business, politics, social media, health, and educational systems. As technology continues its rapid advancement, so too come new opportunities and considerations in research and application. This talk will focus on a mixed-methods systematic review of research implications, ethical considerations, and opportunities with big data specific to vulnerable and special populations such as underrepresented groups, young children, and individuals with disabilities.
David Severski | A growing number of technology and information security practitioners are interested in quantitative risk management practices, such as OpenFAIR, which have a heavy dependence on simulations to measure risk. To help enterprise users who are often not familiar with R perform these simulations, I developed the Evaluator library (available on CRAN). This talk will provide a quick overview of Evaluator and the design choices made to make it accessible to a broad audience.
Eric Leung | The first thing you’ll want to do with new data is to explore and understand it before making inferences. Base R functions like summary() and head() are helpful to get a general sense of the data but are limited. Alternatively, graphics are a great visual method for exploring data, but they don’t scale very well with hundreds of variables. Here I share three functions I wish I knew about earlier and have found useful when exploring a lot of data.
Dror Berel | S4 is a popular object-oriented system in R. The Bioconductor community utilizes it to store and analyze high-throughput omic data. However, retrieving raw data from an S4 object is not trivial and requires ‘unlocking’ the slots using a special operator. This work will demonstrate a systematic ‘tidy’ approach to retrieve raw data from publicly available data sets in S4 format. It will utilize a map-reduce approach, implemented in R via the purrr package.
Norma Padron | In the specific context of health care, both the societal value of using data and its risks are exacerbated due to a fragmented data infrastructure. Practical approaches to explore, analyze, and visualize data are needed so that practitioners and experts can co-design solutions. Unfortunately, statistical knowledge remains scarce across health care delivery practitioners, and R, open data, and open source tools present a unique opportunity to democratize knowledge. Here I present some applications.


cRaggy graphics show-and-tell

Analyze BIKETOWN data, join it with any other data that makes sense to you, and make a summary graphic to tell a story. Best submissions will get to give a lightning talk to describe their work.

The first annual cRaggy graphics show-and-tell will be an informal event held during CascadiaRConf. Everyone can participate by submitting an entry (as a group or individually), by upvoting entries that you like during the day, and by listening to the 3 lightning talks about the most popular entries at the end of the day.

Everybody gets a look at one dataset (details below), announced on Friday May 18th (about 2 weeks before the conference) on the conference website and in an email sent to everyone who’s pre-registered. The dataset is not mtcars or one of the usual suspects! It is public and easy-to-get-to, but relatively obscure and fun to play with.

On June 2, conference attendees post an entry by 9 am (look for the signs for location in the main meeting room), which consists of:

  • One graph printed on 8 1/2” × 11” paper that explores the cRaggy dataset (which you can subset or augment with other data if you’d like)
  • A URL pointing to the code that produced your graph (preferably on GitHub)
  • A print-out of your code (or be ready to project it during a possible lightning talk)
  • Contact info so you can be notified during the conference that you are invited to do a lightning talk

Entries will be posted so everyone can see, comment, and vote with a green cRaggy dot that you get at registration. Remember (lifted from TidyTuesday guidelines):

  • This is for fun
  • It is NOT about criticizing the original article or graph. Real people made the graphs, collected or acquired the data! Focus on the provided dataset, learning, and improving your techniques in R.
  • It is NOT about criticizing or tearing down your fellow #rstats practitioners! Be supportive and kind to each other! Like others’ entries if you want to hear more from them!

Votes will be counted after the cutoff at 2 pm, 3 speakers will be notified immediately, and lightning talks will be around 4:30 pm. Each lightning talk will be about 5 minutes, plus time to connect your computer to the projection screen.

This is a social data project in R, along the lines of Tidy Tuesday exercises.



Hack room

A good portion of the day will be taken up with workshops. You do not have to attend a workshop. If you don't, one of the two big rooms (3A001) will be set aside for socializing, discussing, coding, and learning together.

There's no format per se for the hack room. Feel free to get discussions going on a topic you're interested in, or learn some new package together.



Birds of a Feather Dinners

IMPORTANT: Note that the Portland Starlight Parade will be running from 9-11 pm. It's a good idea to pick dinner spots that avoid the parade area.

  • Job seekers
  • Data Viz
  • Beginners
  • Biology/environment
  • Public health

Join the #bof-dinners Slack channel to chat with fellow conference attendees about BOFs.

The topics above were found to be some of the most requested BOF ideas - you're of course free to gather around any other topic!





About us

CascadiaRConf is a regional R conference for the Pacific Northwest - brought to you by pdxrlang

Contact
