Introduction
|
Spreadsheets serve an important function in most work and research environments.
Spreadsheets work best for certain tasks, and the data in them must be formatted in a specific way to support those tasks.
|
Formatting data tables in spreadsheets
|
Never modify your raw data. Always make a copy before making any changes.
Keep track of all of the steps you take to clean your data.
Organize your data according to tidy data principles.
Record metadata in a separate plain text file.
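The first two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file names (`survey_raw.csv`, `cleaning_log.txt`) and the helper functions `start_cleaning` and `log_step` are hypothetical, not part of any lesson material or library.

```python
import shutil
import tempfile
from pathlib import Path


def start_cleaning(raw_path: Path, work_dir: Path) -> Path:
    """Copy the raw file into work_dir and start a plain-text log of cleaning steps."""
    work_dir.mkdir(parents=True, exist_ok=True)
    working_copy = work_dir / f"{raw_path.stem}_working{raw_path.suffix}"
    shutil.copy2(raw_path, working_copy)  # the raw file is only read, never written
    log_path = work_dir / "cleaning_log.txt"
    log_path.write_text(
        f"1. Copied {raw_path.name} to {working_copy.name}\n", encoding="utf-8"
    )
    return working_copy


def log_step(work_dir: Path, description: str) -> None:
    """Append one numbered cleaning step to the log."""
    log_path = work_dir / "cleaning_log.txt"
    step_number = len(log_path.read_text(encoding="utf-8").splitlines()) + 1
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"{step_number}. {description}\n")


# Demonstration in a throwaway directory (hypothetical file names and data).
base = Path(tempfile.mkdtemp())
raw = base / "survey_raw.csv"
raw.write_text("plot_id,weight_g\n1,12.5\n", encoding="utf-8")

working = start_cleaning(raw, base / "work")
log_step(base / "work", "Replaced blank weights with NA")
print((base / "work" / "cleaning_log.txt").read_text(encoding="utf-8"))
```

All later edits would go to the working copy, so the raw file stays available as a reference, and the log doubles as a plain-text record of what was done.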
|
Formatting problems
|
Avoid using multiple tables within one spreadsheet.
Avoid spreading data across multiple tabs (but do use a new tab to record data cleaning or manipulations).
Record zeros as zeros.
Use an appropriate null value to record missing data.
Don’t use formatting to convey information or to make your spreadsheet look pretty.
Place comments in a separate column.
Record units in column headers.
Include only one piece of information in a cell.
Avoid spaces, numbers and special characters in column headers.
Avoid special characters in your data.
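Some of the points above can be checked automatically once the table is saved as text. The sketch below is a minimal example, assuming a header convention of letters and underscores only and a small made-up data set; the names `problem_headers` and `count_blank_and_zero` are hypothetical.

```python
import csv
import io
import re

# Assumed header convention for this sketch: letters and underscores only.
VALID_HEADER = re.compile(r"[A-Za-z_]+")

# Hypothetical survey data: the second weight is missing, the third is a true zero.
SAMPLE = "plot_id,weight (g),species\n1,12.5,DM\n2,,DO\n3,0,PF\n"


def problem_headers(headers):
    """Return the headers that contain spaces, digits, or special characters."""
    return [h for h in headers if not VALID_HEADER.fullmatch(h)]


def count_blank_and_zero(rows, column):
    """Count empty cells and literal zeros separately; they mean different things."""
    blank = sum(1 for row in rows if row[column].strip() == "")
    zero = sum(1 for row in rows if row[column].strip() in {"0", "0.0"})
    return blank, zero


reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
bad = problem_headers(reader.fieldnames)
blank, zero = count_blank_and_zero(rows, "weight (g)")

print(bad)          # ['weight (g)']
print(blank, zero)  # 1 1
```

Here `weight (g)` is flagged because of the space and parentheses; a header such as `weight_g` would keep the unit and pass the check. The separate counts show why a blank cell and a recorded zero should not be treated as the same thing.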
|
Data type tips and tricks
|
Sometimes the issue is with the data type, not the data.
There are clues in your spreadsheet software that will alert you to possible issues.
You can get a sense of what might be happening by saving the data as a CSV file and looking at it as plain text.
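Looking at the exported text can also be done programmatically. The following sketch assumes a hypothetical export in which one weight was stored as text with a unit attached; `non_numeric_cells` is an illustrative helper, not a function from any library.

```python
import csv
import io

# Hypothetical CSV export: the weight on line 3 was typed as text ("13 g").
CSV_TEXT = "plot_id,weight_g\n1,12.5\n2,13 g\n3,11.0\n"


def non_numeric_cells(text, column):
    """Return (line number, value) pairs in `column` that do not parse as a number."""
    problems = []
    # Data rows start on line 2 because line 1 holds the column headers.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        value = row[column]
        try:
            float(value)
        except ValueError:
            problems.append((line_no, value))
    return problems


# Show each line exactly as it is stored in the text file.
for line in CSV_TEXT.splitlines():
    print(repr(line))

issues = non_numeric_cells(CSV_TEXT, "weight_g")
print(issues)  # [(3, '13 g')]
```

Printing with `repr` makes stray spaces and quote characters visible, and the check points to the line where a numeric column holds something that is not a number.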
|
Dates as data
|
|
Quality assurance
|
|
Exporting data
|
Data stored in common spreadsheet formats is often not read correctly by data analysis software, which introduces errors into your data.
Exporting data from spreadsheets to formats like CSV or TSV puts it in a format that can be used consistently by most programs.
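As a rough illustration of what these formats look like, the sketch below writes the same made-up rows as CSV and as TSV with Python's standard `csv` module and reads the CSV back. The data and the helper name `to_delimited` are assumptions for the example.

```python
import csv
import io

# Hypothetical rows; one note contains a comma to show how CSV handles it.
rows = [
    {"plot_id": "1", "species": "DM", "notes": "caught, then released"},
    {"plot_id": "2", "species": "DO", "notes": ""},
]


def to_delimited(rows, delimiter):
    """Serialize a list of dicts as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_delimited(rows, ",")
tsv_text = to_delimited(rows, "\t")

# Reading the CSV text back recovers the original values.
round_trip = list(csv.DictReader(io.StringIO(csv_text)))

print(csv_text)
print(round_trip == rows)  # True
```

In the CSV output the note containing a comma is wrapped in double quotes, which is why commas and other special characters in data need care. In the TSV output the same note needs no quoting because the delimiter is a tab.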
|