Thinking about this in terms of the data used to test EDA packages, here are some brief thoughts on what that data should look like:
- Some specific data sets
- Structure that is needed for data sets
  - Missing Data
  - Data of the following types:
    - Factor
    - Ordered factor
    - Continuous
    - Integer
    - String/character
    - Date/Time
Perhaps we should require that test datasets contain some mix of these data types, so that we can understand each package's strengths and weaknesses?
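
As a rough sketch of how such a requirement could be checked, the snippet below reports which of the data types listed above a candidate dataset lacks, and whether it contains any missing values. Everything in it is an assumption made for illustration: the kind labels, the `coverage_report` helper, the `(kind, values)` column layout, and the use of `None` for missing data. None of it comes from a particular EDA package.

```python
from datetime import datetime

# Kinds taken from the list above; the label spellings are illustrative.
REQUIRED_KINDS = {
    "factor", "ordered_factor", "continuous",
    "integer", "string", "datetime",
}


def coverage_report(columns):
    """Report absent kinds and whether any value is missing.

    `columns` maps a column name to a (kind, values) pair.
    A value of None marks missing data (an assumption of this sketch).
    """
    kinds_present = {kind for kind, _ in columns.values()}
    has_missing = any(
        value is None
        for _, values in columns.values()
        for value in values
    )
    return {
        "absent_kinds": sorted(REQUIRED_KINDS - kinds_present),
        "has_missing": has_missing,
    }


# A tiny hypothetical dataset that covers every kind and has missing values.
example = {
    "species": ("factor", ["a", "b", None]),
    "size": ("ordered_factor", ["small", "large", "medium"]),
    "weight": ("continuous", [1.5, None, 3.2]),
    "count": ("integer", [1, 2, 3]),
    "note": ("string", ["ok", "check", "ok"]),
    "seen_at": ("datetime", [datetime(2020, 1, 1), datetime(2020, 1, 2), None]),
}

report = coverage_report(example)
```

A dataset that passes a check like this exercises every type in the list at least once, which is probably the minimum needed before comparing how different packages handle each one.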