Features and commands

User interface

EDA is a consistently designed and developed analysis environment for interactive, exploratory research. Its design is screen-oriented, not paper-oriented.

The user communicates with EDA through a simple yet powerful command language designed for truly interactive work. On-line help and command editing facilities are provided.

Print file

The results of a terminal session may be placed into a print file, either completely or selectively. Print files can be reviewed (selection of images) and pretty-printed. Additional options and a text editor facilitate the transfer of the results to a text formatter.

Data matrix

Variables to be analyzed are held in a work area (data matrix). Variables are referred to either by name or position. Case identifiers and a grouping variable are associated with each work area. Variables may be tied together to form groups.

In addition to its name and descriptor, a variable can carry a document. Documents (text of any length) are structured into several levels of documentation, allowing selective search and display. Numerical information may be embedded within documents and then searched and extracted for analysis.
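The idea of extracting numbers embedded in a document can be illustrated outside EDA. The following is a minimal Python sketch, not EDA syntax: the "label: value" line layout, the sample text and the function name extract_numbers are all assumptions made for illustration.

```python
import re

# Hypothetical variable document: free text mixed with "label: number" lines.
document = """\
Source: household survey, wave 2.
Population 1980: 6335000
Response rate: 71.5
"""

# A line qualifies only if everything after the colon is a single number.
NUMBER_LINE = re.compile(
    r"^(?P<label>[^:\n]+):[ \t]*(?P<value>-?\d+(?:\.\d+)?)[ \t]*$",
    re.MULTILINE,
)

def extract_numbers(text):
    """Return a mapping from label to numeric value for every qualifying line."""
    return {m["label"].strip(): float(m["value"]) for m in NUMBER_LINE.finditer(text)}

numbers = extract_numbers(document)
```

Lines whose value part is ordinary text, such as the "Source" line, are skipped, so only the numeric entries reach the analysis.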

Data analysis

The data analysis commands offered in EDA cover three main areas (see tables I and II for an overview of the analysis commands):
  1. Exploratory data analysis
  2. Multidimensional methods
  3. Cluster analysis

Data input and files

Like most systems, EDA handles raw data files and its own files (archive files), which are stored in an EDA-specific file system. (A normal user is completely shielded from operating system specifics.)

There are facilities for data input from the interactive terminal and dialogue-assisted building of documented system files.

Special attention has been paid to communication with standard packages, especially SPSS and SAS (EDA produces SPSS setups as well as SAS DATA steps).

The PC version features a read/write interface to spreadsheet programs.

Data editing and correction

A number of data editing and correction commands are grouped in a special module of the EDA analysis system, the data editor. It provides text-editor-like commands for data editing and adds security, since changes may be undone.

Transformations

Powerful commands allow for conditional and unconditional transformations using algebraic and logical expressions. The transformation facility is designed to encourage numerical experimentation. In addition to the standard operations and functions (more than 100), EDA includes statistical functions and a large number of exploratory functions. Computations may be performed on variables, matrix rows and columns, and scalars (including individual matrix elements and scalar variables). It is easy, for instance, to replace all outliers by the median of the corresponding variable. Transformations are easily applied to several variables at once using control structures (loops).
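The outlier example can be sketched in Python to show what such a transformation computes. This is an illustration under stated assumptions, not EDA's command syntax: outliers are taken to be values beyond 1.5 times the quartile spread from the quartiles, which may differ from the fence definition EDA uses, and the function and variable names are hypothetical.

```python
from statistics import median, quantiles

def replace_outliers_with_median(values, k=1.5):
    """Return a copy of values in which points outside the fences become the median."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    centre = median(values)
    return [v if lower <= v <= upper else centre for v in values]

# Hypothetical variable with one wild value.
income = [21, 23, 22, 25, 24, 250, 20]
cleaned = replace_outliers_with_median(income)
print(cleaned)  # [21, 23, 22, 25, 24, 23, 20]
```

The original list is left untouched, so the raw and the cleaned variable can be compared side by side.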

Case selection

A case selection lets you analyze selected observations without altering the data matrix. Cases may be selected based on logical expressions, group memberships and the like.
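One way to picture a non-destructive case selection is as a list of case positions kept apart from the data. The Python sketch below is an assumption about the general idea only, not a description of how EDA implements it; the data, select_cases and view are hypothetical.

```python
# Hypothetical data matrix: one list per variable, one position per case.
data = {
    "age":   [34, 51, 29, 62, 45],
    "group": [1, 2, 1, 2, 2],
}

def select_cases(matrix, predicate):
    """Return the positions of the cases for which the logical expression holds."""
    n_cases = len(next(iter(matrix.values())))
    return [
        i for i in range(n_cases)
        if predicate({name: column[i] for name, column in matrix.items()})
    ]

def view(matrix, name, selection):
    """Read one variable through a selection without copying or changing the matrix."""
    return [matrix[name][i] for i in selection]

# Select members of group 2 who are older than 50.
selection = select_cases(data, lambda case: case["group"] == 2 and case["age"] > 50)
selected_ages = view(data, "age", selection)
print(selection, selected_ages)  # [1, 3] [51, 62]
```

Dropping the selection restores the full set of cases, because the data matrix itself was never modified.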

Macros

Macro and abbreviation facilities are provided. Together with user-formatted output procedures, control structures allowing for repetitive execution, and scalar variables, they allow commands to be tailored to one's needs or new commands to be built. Macros may be stored in macro archives. EDA comes with a sample macro library containing, for instance, an interface to the SPAD data analysis program.

Toolbox

The toolbox contains a series of general-purpose commands for file handling and other data and text processing tasks, including sorting, data checking, file concatenation, modification and many other useful operations on files.

Text editor

The EDA text editor is used for editing output from commands, documents, macros, variable descriptions, case identifiers or any text file.

Miscellaneous commands

Other facilities include data aggregation, generation of "artificial" data (random etc.), matrix manipulation (e.g. transposition of the work area), weighting, counting, percent checking and many other transformation tools for common tasks.

Many tools are provided for teachers: teacher-student communication, user or group profiles, and user monitoring.

As EDA is a consistently developed interactive program, a large set of commands deals with the control of the user's environment in order to meet the specific needs of each user or group of users. EDA contains many more commands and facilities which cannot be adequately described in this short text.

Commands (main analysis commands)

Exploration

  • ADDFIT fit an additive model to a table
  • BOXPLOT box and whisker plot with outlier display; parallel boxplots and reexpression diagnostics
  • BREAK (cross) break of variables (interval coding)
  • COMPARE compare variables
  • DIAGNOSTIC diagnostic routines (e.g. for assessing normality of variables)
  • DLINE density line
  • DISPLAY basic univariate statistics, including hinges, fences and Tukey's biweight; trimming options
  • FREQUENCY frequency tables
  • HISTOGRAM several forms of histograms
  • LINE resistant line, Tukey-line, and LSQ line
  • LIST data lister: numerical and coded displays (many forms); sort options for cases and variables; blanking options
  • LOWESS scatterplot smoothing
  • MAP simple cartography
  • MDIAG multivariate diagnostic methods
  • PLOT plots one, two, three or more variables; many symbol types, outlier elimination, large printer-size plots, tools for detailed analysis (zooming, case identification), transformations etc.
  • PROFILE displays profiles of single cases or groups
  • REGRESS biweight multiple regression (Tukey)
  • REEXPRESS search for appropriate reexpression
  • QSUMMARY quick (numerical) summaries of variables
  • SHOW conditional numerical and coded displays of variables
  • SMOOTH free smoothing (running medians)
  • STEMLEAF stem and leaf display (simple, back-to-back, groupwise)
  • SUMMARY numerical summaries and letter values
  • TRACES hinge and letter value tracing
  • XTAB crosstabulation
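To give a feel for the resistant techniques in this list, smoothing by running medians (as named under SMOOTH) can be sketched in a few lines of Python. This is an illustration only: it uses a window of three and copies the end points, and the smoothers and options that EDA actually offers are not specified here.

```python
from statistics import median

def running_median3(values):
    """Smooth a sequence with running medians of three, copying the end points."""
    if len(values) < 3:
        return list(values)
    middle = [median(values[i - 1:i + 2]) for i in range(1, len(values) - 1)]
    return [values[0], *middle, values[-1]]

# Hypothetical series with a spike (40) and a dip (1).
series = [4, 5, 40, 6, 7, 8, 1, 9]
smoothed = running_median3(series)
print(smoothed)  # [4, 5, 6, 7, 7, 7, 8, 9]
```

Unlike a running mean, the median ignores the isolated spike and dip entirely, which is why such smoothers are called resistant.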

Dimensional analysis

  • ANACOR correspondence analysis (Benzecri)
  • CANON canonical analysis
  • CFIX fit two configurations
  • CFIT configuration comparison (Procrustes rotation and other techniques)
  • FACTOR principal components and principal axes; options include Gabriel's biplot
  • MDS multidimensional scaling (Kruskal-Shepard)
  • MINISSA smallest space analysis (Guttman-Lingoes)
  • SCORES factor scoring
  • TSCALE metric dimensional scaling

Cluster analysis

  • CLUSTER non-hierarchical clustering (4 methods)
  • HIERARCHY hierarchical clustering (6 methods)
  • VHIERARCH hierarchical clustering on variables (10 methods)
  • TREE detailed analysis of the hierarchical tree

Related commands

  • BASSOC compute binary association measures
  • C1, C2 analyze the result matrices from dimensional analyses (configurations): numerical and coded displays of loadings and scores, plotting (including simultaneous plots of configurations), profiles of configurations etc.
  • CORRELATE compute a correlation matrix (options include variance-covariance matrix, rank transformation, robustness transformation and jackknife)
  • DISTANCE computes a distance matrix
  • GANALYSIS, GSUMMARY analyze and compare groups (numerical and coded summaries). Used for detailed analysis of a cluster analysis.
  • MATRIX inspect a matrix (distance, correlation): numerical and coded lists, checking, matrix manipulation
  • ROTATE rotates a configuration
  • TRACES group analysis (using boxplots)
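
Several of the commands above operate on a distance matrix between cases. As a minimal sketch of what such a matrix contains, the following Python code computes Euclidean distances. The choice of the Euclidean metric and the name distance_matrix are assumptions made for illustration; the measures offered by DISTANCE are not listed in this text.

```python
from math import dist

def distance_matrix(cases):
    """Return the symmetric matrix of Euclidean distances between all pairs of cases."""
    return [[dist(a, b) for b in cases] for a in cases]

# Hypothetical cases measured on two variables.
cases = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
d = distance_matrix(cases)
print(d[0][1], d[0][2], d[1][2])  # 5.0 10.0 5.0
```

The diagonal is zero and the matrix is symmetric, so clustering and scaling methods only need one triangle of it.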