
Clean up your data screening process with reporteR

Event:
Clean up your data screening process with reporteR
Event type:
Meetup
Category:
IT
Topic:
Date:
03.12.2020 (Thursday)
Time:
19:00
Language:
English
Price:
Free
City:
Place:
Online Event
Address:
On your computer
Description:

# Stream URL

- https://youtu.be/djSbNBa2S_c


# Talk

- Clean up your data screening process with _reporteR_


# Event Sponsored by Jumping Rivers

https://www.jumpingrivers.com/


# Details

- webinars http://whyr.pl/webinars/

- donate http://whyr.pl/donate/

- join Why R? Slack http://whyr.pl/slack/

- join Meetup http://tiny.cc/WarsawRUG

- format: 45 min talk + 15 min for Q&A

- comments: ask in the YouTube live chat


# Speakers


- University of Copenhagen


### Claus Ekstrøm PhD

is a professor in biostatistics at the University of Copenhagen, Denmark. He is the creator of and a contributor to a number of R packages (**reporteR**, **MESS**, **MethComp**, **SuperRanker**) and the author of the book "The R Primer". He has previously given R tutorials at useR 2016, eRum 2018, and the ASA's Conference on Statistical Practice 2018, and won the C. Oswald George prize from Teaching Statistics in 2014.


### Anne Helby Petersen

is a PhD student in biostatistics at the University of Copenhagen, Denmark. She is the primary author of several R packages, including **reporteR**. She has taught statistics and R in numerous courses at the University of Copenhagen with students coming from a wide range of backgrounds, including science, medicine and mathematics.


# Talk description


## Clean up your data screening process with **reporteR**

Data cleaning and data validation are the first steps in practically any data analysis, as the validity of the conclusions from the analysis hinges on the quality of the input data.


Mistakes in the data can arise for any number of reasons, including erroneous codings, malfunctioning measurement equipment, and inconsistent data generation manuals. Consequently, it is essential to enable topic experts who are knowledgeable about the context and data collection procedure to partake in the data quality assessment since they will be better at identifying potential problems in the data. However, they may not have the technical skills to work with the data themselves.


The reporteR package (formerly known as dataMaid) makes it easy to produce a document that less R-savvy collaborators can read, understand, and use to decide “do these data look right?”. The same document also records which potential errors were considered. Both features help make subsequent data science reproducible and document the data at every stage of the quality assessment process.


The package includes both very user-friendly one-liner commands that auto-generate data overview reports and a highly customizable suite of data validation and documentation tools that can be moulded to fit most data validation needs. Perhaps most importantly, it was specifically built to make sure that documentation and validation go hand in hand, so we can clean up any unstructured, messy data cleaning process.
