Handling spreadsheets: good practices and OpenRefine – POSTPONED

Lisanna Paladin   2023-07-31

Date/Time
Date(s) - 2023-07-31
09:00 CEST - 12:00 CEST

Location
Hybrid

:warning: Due to low interest this course has been postponed :warning:

Course content

Good data organization is the foundation of any (research) project, and this type of work usually starts with spreadsheets.

We spontaneously organise data in spreadsheets in a human-readable way, but that sometimes doesn’t work well when the same data needs to be processed by a computer. Preparing our data in a machine-readable way helps tremendously with analysing and plotting the data, changing formats, or even applying actions in bulk – so good practices in spreadsheet organisation can make our lives easier. That’s what we will teach in this lesson.

In this lesson you will learn:

  • Good data entry practices – formatting data tables in spreadsheets
  • How to avoid common formatting mistakes
  • Approaches for handling dates in spreadsheets
  • Basic quality control and data manipulation in spreadsheets
  • Exporting data from spreadsheets
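
As a rough illustration of the last two points, the sketch below stores a date as separate year, month and day columns and exports the table as CSV. It is only one possible approach, written in plain Python rather than in a spreadsheet program; the records, the column names and the `to_csv` helper are all hypothetical.

```python
import csv
import io
from datetime import date

# Hypothetical records; the column names are assumptions for illustration.
records = [
    {"sample_id": "A1", "collected": date(2023, 7, 31), "weight_g": 12.5},
    {"sample_id": "A2", "collected": date(2023, 8, 1), "weight_g": 11.9},
]

FIELDS = ["sample_id", "year", "month", "day", "weight_g"]


def to_csv(rows):
    """Write rows as CSV text, with each date split into year/month/day columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        collected = row["collected"]
        writer.writerow(
            {
                "sample_id": row["sample_id"],
                "year": collected.year,
                "month": collected.month,
                "day": collected.day,
                "weight_g": row["weight_g"],
            }
        )
    return buffer.getvalue()


csv_text = to_csv(records)
print(csv_text)
```

Keeping the date parts in separate numeric columns avoids depending on how a particular program displays or reinterprets a date cell when the file is opened elsewhere.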

In addition, we will explore the use of OpenRefine, a free desktop application described as "a power tool for working with messy data". OpenRefine is most useful where you have data in a simple tabular format such as a spreadsheet, a comma-separated values (CSV) file or a tab-delimited (TSV) file, but with internal inconsistencies in data formats, in where data appears, or in the terminology used. OpenRefine can be used to standardize and clean data across your file.

It can help you with:

  • Getting an overview of a data set
  • Resolving inconsistencies in a data set, for example standardizing date formatting
  • Splitting data up into more granular parts, for example splitting up cells with multiple authors into separate cells
  • Matching local data up to other data sets – for example, matching forms of personal names against name authority records in the Virtual International Authority File (VIAF)
  • Enhancing a data set with data from other sources
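
To give a feel for two of the clean-up tasks listed above, here is a minimal sketch in plain Python. It does not use OpenRefine or its expression language; it only mimics the idea of standardizing dates and splitting multi-author cells. The sample rows, the list of date formats and the `;` separator are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical date formats found in a messy column (an assumption).
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def split_authors(cell, separator=";"):
    """Split a multi-author cell into trimmed names, dropping empty parts."""
    return [name.strip() for name in cell.split(separator) if name.strip()]


# Hypothetical messy rows, as they might come out of a spreadsheet.
rows = [
    {"date": "31/07/2023", "authors": "Smith, J.; Doe, A."},
    {"date": "2023-07-31", "authors": "Rossi, M."},
]

cleaned = [
    {"date": standardize_date(r["date"]), "authors": split_authors(r["authors"])}
    for r in rows
]
print(cleaned)
```

Values that match none of the listed formats come back as `None` instead of being guessed, so they can be reviewed by hand.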

Training materials

The course content will be largely based on two lessons from The Carpentries.

Registration

Please note: course registration works on a first-come, first-served basis.

Bookings

Bookings are closed for this event.