This article covers how to format and save your data in the tagged format before uploading to Emcien.
The Tagged Format
The tagged format has no header row and is well suited to unstructured data. It is most commonly used for analyses containing free-form text.
Each line begins with two required columns, either of which may be left empty. A data file may contain at most 1000 categories, and each line may contain at most 1000 items. An example of the tagged format is below:
Some important things to note about the Tagged Format:
- The first column is the date, or date and time. Although the column is required, it is allowed to be empty. Details on the date format are below.
- The second column is the transaction ID. As with the date, this column is required but may be left blank. The transaction ID can be any value that uniquely identifies a transaction.
- An item may be represented as “Category::item” or as “item”.
- Any cell containing a comma must be surrounded by double quotes.
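The rules above can be sketched in Python. This is a minimal illustration, not Emcien's own tooling: it assumes the file is comma-delimited (implied by the quoting rule), that the 1000-item limit counts only the cells after the first two columns, and that a category is whatever precedes `::` in an item. The names `write_tagged` and `category_of` and the sample rows are hypothetical.

```python
import csv
import io

MAX_ITEMS_PER_LINE = 1000  # limit stated in the article
MAX_CATEGORIES = 1000      # limit stated in the article


def category_of(item):
    """Return the category of 'Category::item', or None for a bare 'item'."""
    category, separator, _ = item.partition("::")
    return category if separator else None


def write_tagged(rows, out):
    """Write (date, transaction_id, items) tuples as tagged-format lines.

    Assumption: comma-delimited output. csv.writer adds double quotes
    around any cell that contains a comma. No header row is written.
    """
    writer = csv.writer(out, lineterminator="\n")
    categories = set()
    for date, transaction_id, items in rows:
        if len(items) > MAX_ITEMS_PER_LINE:
            raise ValueError(f"more than {MAX_ITEMS_PER_LINE} items on one line")
        categories.update(c for c in map(category_of, items) if c is not None)
        if len(categories) > MAX_CATEGORIES:
            raise ValueError(f"more than {MAX_CATEGORIES} categories in the file")
        # Date and transaction ID always occupy the first two columns,
        # even when they are empty strings.
        writer.writerow([date, transaction_id, *items])


# Hypothetical sample data, for illustration only.
sample_rows = [
    ("2012-07-15T02:23:44", "T1001", ["Product::laptop", "Comment::fast, quiet"]),
    ("2012-07-15T03:10:05", "T1002", ["Product::mouse", "refurbished"]),
]

buffer = io.StringIO()
write_tagged(sample_rows, buffer)
tagged_text = buffer.getvalue()
print(tagged_text)
```

Passing empty strings for the date and transaction ID keeps both columns in place, so such a line starts with two commas.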
Date Format Details
The date column accepts 4 different formats, but the same format must be used throughout a data file. Examples are below.
|YYYY-MM-DDTHH:MM:SS|2012-07-15T02:23:44 (T, t, or a space must be present between the date and time; Z, z, or nothing may follow the time)|
|unixtime|1431209618 (a 1-10 digit number; the assumed time zone is UTC+0)|
|none|No value is required|
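A quick pre-upload check of the date column might look like the sketch below. It covers only the three formats listed in the table, checks the shape of a value rather than whether it is a real calendar date, and treats "none" as an empty cell. The function names are hypothetical and are not part of any Emcien API.

```python
import re
from datetime import datetime, timezone

# T, t, or a space between date and time; optional Z or z after the time.
# This checks the shape only, so a month of 13 would still match.
TIMESTAMP_PATTERN = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}[Tt ][0-9]{2}:[0-9]{2}:[0-9]{2}[Zz]?"
)
# A 1-10 digit number, read as seconds since the epoch in UTC.
UNIXTIME_PATTERN = re.compile(r"[0-9]{1,10}")


def classify_date(value):
    """Name the format a date cell appears to use, or 'unknown'."""
    if value == "":
        return "none"
    if TIMESTAMP_PATTERN.fullmatch(value):
        return "timestamp"
    if UNIXTIME_PATTERN.fullmatch(value):
        return "unixtime"
    return "unknown"


def is_consistent(values):
    """True when every date cell uses the same recognized format."""
    kinds = {classify_date(v) for v in values}
    return len(kinds) == 1 and "unknown" not in kinds


def unixtime_to_utc(value):
    """Convert a unixtime cell to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


print(classify_date("2012-07-15T02:23:44"))
print(is_consistent(["1431209618", "2012-07-15T02:23:44"]))
print(unixtime_to_utc("1431209618").isoformat())
```

Running a check like this over the first column before uploading flags files that mix formats, which the consistency rule above does not allow.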
Not sure which format is best for you? The Emcien team can help prepare your data. Contact us at [email protected].