This article covers how to format and save your data in wide format before uploading to EmcienPatterns.
Wide format is the most commonly used EmcienPatterns format and the easiest to start with. It consists mostly of user-defined columns, with no required column headings.
It is also the more universal format: it can be used for multi-dimensional data, such as demographics or configurable products. In wide format, each transaction is represented by a single row of data.
An example of wide format is displayed below.
This example contains four transactions, each represented by a single row of data. In this data, each transaction represents a client and their associated demographics data.
Unlike long format, wide format has no required column headings and consists mostly of user-defined columns.
Wide Format Details
Headers are required for each column in wide format. The file must contain at least two user-defined columns with a maximum of 1,000 total columns.
While the data can contain any UTF-8 characters, the headers must use only lower ASCII characters.
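The header rules above can be checked before upload. The following Python sketch is illustrative only and is not part of EmcienPatterns. The function name check_headers is hypothetical, "lower ASCII" is read here as the 7-bit range (code points 0 to 127), and the sketch does not distinguish optional columns from user-defined ones, so it only enforces a minimum of two columns overall.

```python
MAX_COLUMNS = 1000  # maximum total columns stated above
MIN_COLUMNS = 2     # at least two user-defined columns are required


def check_headers(headers):
    """Return a list of problems found in a wide-format header row."""
    problems = []
    if len(headers) < MIN_COLUMNS:
        problems.append(f"need at least {MIN_COLUMNS} columns, found {len(headers)}")
    if len(headers) > MAX_COLUMNS:
        problems.append(f"at most {MAX_COLUMNS} columns allowed, found {len(headers)}")
    for position, header in enumerate(headers, start=1):
        if not header:
            # Every column needs a header.
            problems.append(f"column {position} has no header")
        elif not header.isascii():
            # Assumption: "lower ASCII" means code points 0 to 127.
            problems.append(f"column {position} header is not ASCII: {header!r}")
    return problems
```

An empty list means no problems were found by these checks. For instance, check_headers(["age", "région"]) reports one problem, because the second header contains a non-ASCII character.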
Most columns for wide format data are user-defined. EmcienPatterns also offers optional columns, which are listed in the table below. The use of these optional columns will enable certain EmcienPatterns features, such as date and time trends. If used, the optional column header must exactly match the column header listed in the table below.
The date and time when the transaction occurred. Field values must be formatted using ISO or Unix standards.
Field values can contain date only, or date and time. The column heading indicates the date and time format used.
Important: If used, this column must be first.
The volume applied to all attributes of the transaction. This column is used to calculate the strength of connections in your data.
Field values must be non-negative whole numbers consisting only of digits.
Important: If you use this column and not the transaction_date, transaction_time, or transaction_time_unix columns, this column must be first.
If you use this column and the transaction_date, transaction_time, or transaction_time_unix column, this column must be second.
This column is defined by you. Each user-defined column must use a unique column header.
Important: Your data file must contain at least two user-defined columns.
Field values must be wrapped in double quotes (") and cannot be longer than 255 characters.
All double quotes inside any string should be escaped with another double quote. For example:
"String 3" Roll" should be changed to "String 3"" Roll"
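The quoting and escaping rule for user-defined field values can be sketched in Python as follows. This is a minimal illustration rather than an official tool. The function name quote_field is hypothetical, and the sketch assumes the 255-character limit applies to the value before the surrounding quotes and escapes are added.

```python
MAX_FIELD_LENGTH = 255  # limit stated above; assumed to apply to the raw value


def quote_field(value):
    """Wrap a value in double quotes, doubling any embedded double quotes."""
    text = str(value)
    if len(text) > MAX_FIELD_LENGTH:
        raise ValueError(
            f"field is {len(text)} characters; limit is {MAX_FIELD_LENGTH}"
        )
    # Escape each embedded double quote with a second double quote.
    return '"' + text.replace('"', '""') + '"'


print(quote_field('String 3" Roll'))  # prints "String 3"" Roll"
```

Doubling the quote character follows the convention most spreadsheet and CSV tools use, so values escaped this way should survive a round trip through those tools.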