This article covers how to format and save your data in tagged format before uploading to EmcienPatterns.
Tagged format contains no headers and works well for unstructured data. It is most commonly used for analyses containing free-form text.
Formatting
Tagged format consists of two required columns (which may be empty), followed by the items for that line, and no headers. A data file may contain at most 1000 categories, and each line may contain at most 1000 items. An example of the tagged format is below:
Some important things to note about Tagged Format:
- The first column is the date, or date and time. Although the column is required, it is allowed to be empty. Details on the date format are below.
- The second column is the transaction ID. As with the date, this column is required but may be blank. The transaction ID can be any value that uniquely identifies a transaction.
- An item may be represented as “Category::Item” or as “Item”.
- Any cell containing a comma must be surrounded by double quotes.
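The notes above can be sketched in a few lines of Python. This is only an illustration, not an official Emcien tool: it assumes the file is comma-separated (implied by the quoting rule above), and the function names, transaction IDs, and item values are all hypothetical.

```python
import csv
import io

MAX_ITEMS_PER_LINE = 1000  # per-line limit stated in this article


def format_item(item, category=None):
    """Return 'Category::Item' when a category is given, otherwise 'Item'."""
    return f"{category}::{item}" if category else item


def write_tagged_rows(rows, out):
    """Write (date, transaction_id, items) tuples as header-less rows.

    Assumption: columns are comma-separated. csv.QUOTE_MINIMAL wraps any
    cell that contains a comma in double quotes and leaves other cells bare.
    """
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for date, transaction_id, items in rows:
        if len(items) > MAX_ITEMS_PER_LINE:
            raise ValueError(f"more than {MAX_ITEMS_PER_LINE} items on one line")
        # Date first, transaction ID second, then one cell per item.
        writer.writerow([date, transaction_id, *items])


def render_example():
    """Build a small in-memory sample; all values are made up."""
    rows = [
        ("2012-07-15", "T1001", [format_item("Red", "Color"), format_item("Gift")]),
        ("2012-07-16", "T1002", [format_item("Arrived late, box damaged", "Comment")]),
    ]
    buf = io.StringIO()
    write_tagged_rows(rows, buf)
    return buf.getvalue()
```

To leave the date or transaction ID blank, pass an empty string for that value; the cell is written empty but the column is still present. To save to disk, passing a file opened with `open(path, "w", newline="")` in place of the `StringIO` buffer should work the same way.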
Date Format Details
The date column accepts four formats, counting an empty value as one of them, but the same format must be used throughout the data file. Below are examples of the possible date formats.
Format | Example |
YYYY-MM-DD | 2012-07-15 |
YYYY-MM-DDTHH:MM:SS | 2012-07-15T02:23:44 (T, t, or a space must separate the date and time; the time may be followed by Z, z, or nothing) |
unixtime | 1431209618 (a 1-10 digit number; the assumed time zone is UTC+0) |
none | No value is required |
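Before uploading, it can help to check that every value in the date column follows one of the formats in the table and that the file does not mix them. The sketch below is one possible way to do that in Python. The function and pattern names are hypothetical, the patterns check only the shape of each value (not whether the month or hour is valid), and treating a mix of empty and non-empty dates as inconsistent is an assumption rather than documented behavior.

```python
import re

# One pattern per row of the table above; the names are illustrative only.
DATE_PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    # T, t, or a space between date and time; optional Z or z at the end.
    "datetime": re.compile(r"\d{4}-\d{2}-\d{2}[Tt ]\d{2}:\d{2}:\d{2}[Zz]?"),
    "unixtime": re.compile(r"\d{1,10}"),
    "none": re.compile(r""),  # matches only the empty string
}


def classify_date(value):
    """Return the name of the format that value matches, or None."""
    for name, pattern in DATE_PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return None


def check_consistent(values):
    """Return the single format used by all values, or raise ValueError."""
    formats = set()
    for value in values:
        name = classify_date(value)
        if name is None:
            raise ValueError(f"unrecognized date value: {value!r}")
        formats.add(name)
    if len(formats) > 1:
        raise ValueError(f"mixed date formats: {sorted(formats)}")
    # An empty column is reported as "none".
    return formats.pop() if formats else "none"
```

For example, `check_consistent(["2012-07-15", "2012-07-16"])` returns `"date"`, while a column containing both `"2012-07-15"` and `"1431209618"` raises a `ValueError`.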
Not sure which format is best for you? The Emcien team can help prepare your data. Contact us at [email protected].