EmcienPatterns Data Requirements

Version 1.7 – Aug. 20, 2010

Overview

Emcien makes loading product data simple by using an industry standard “delimited file” as its input format.

Emcien’s data format combines items, transaction dates, and financial data in a simple layout that feeds the EmcienPatterns system. The overall format of these files often resembles a spreadsheet or report you may use today.

The basic format of the file is a standard delimited format that breaks the data into columns, with the comma “,” character separating the columns. An optional value that is omitted still keeps its delimiters, with nothing between them. Depending on the size of the file, most spreadsheet applications can open, read, and save delimited files correctly.
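As a rough illustration of the format, the sketch below splits one delimited line into its columns using Python. The sample line is one of the example rows from this document; the variable names are illustrative only. Because a comma never appears inside a value, a plain split is enough, and an omitted optional value comes back as an empty string.

```python
# One line of a delimited file: ten columns separated by commas.
# The seventh column is an optional value left blank, so two commas
# appear back to back.
line = "10001,Abc23,10/2/09,1,239.65,1,,Wood Rake,Lawn Care,ACME Inc."

# Values never contain a comma, so a plain split recovers the columns.
columns = line.split(",")

print(len(columns))       # number of columns in the line
print(repr(columns[6]))   # the blank optional value, as an empty string
```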

Data Elements

There are only 4 required data items in this data feed: trans_id, item_id, trans_date, and trans_volume. The others are optional, but the more you provide, the more EmcienPatterns can show you.

Here are the possible fields:

Column 1: trans_id (Required, 32 characters max)
Any unique identifier that repeats on every line of a single transaction but changes for the next transaction. All lines for a transaction id must be grouped together; they cannot be spread out through the file.

Column 2: item_id (Required, 32 characters max, example could be a SKU number)
A unique item identifier, typically a SKU number or option code for your product. Any string that uniquely identifies the item is acceptable.

Column 3: trans_date (Required, mm/dd/yy or mm/dd/yyyy)
A date when the transaction occurred or any key date you wish to track regarding this transaction. Patterns will track all clusters of items through time using this date if it’s present.

Column 4: trans_volume (Required, numeric unsigned 999999999 format)
The volume of the entire transaction, not of the item_id. It is assumed to be ‘1’ if the field is left blank.

Column 5: trans_total_price (optional, numeric unsigned 99999999.99 format)
The total price of the transaction for all the items within it. This amount DOES NOT have to add up to the same amount as the item_prices if you wish to include discounts, taxes, etc.

Column 6: item_vol (optional, numeric unsigned 999999999 format)
The volume of the item_id within the specific transaction. It is assumed to be ‘1’ if left blank. Example: A transaction with 2 items on it, Oil and Oil Filter, could have a ratio of something like 6 to 1 of oil per oil filter. In that case the Oil item_id would have an item_vol of 6, the item_vol for Oil Filter would be 1, and the trans_volume would be 1.

Column 7: item_price (optional, numeric unsigned 99999999.99 format)
The individual price of the item_id. Typically MSRP or some other undiscounted price is used, but it can be any price you wish to track.

Column 8: item_name (optional, 32 characters max)
A more human-readable version of the item_id, typically a short name or alias. For example, the item_id could be RJT00345 and the item_name ‘Green Striped Lawn Chair’.

Column 9: item_attr_1 (optional, 32 characters max)
Any text identifier, typically used for categorizing the item. Examples include product category, manufacturer, etc.

Column 10: item_attr_2 (optional, 32 characters max)
Any text identifier, typically used for categorizing the item. Examples include product category, manufacturer, etc.
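The column definitions above can be turned into a simple pre-load check. The following Python sketch is only one possible reading of them, not an Emcien tool: the names `validate_row`, `FIELDS`, and `NON_BLANK` are hypothetical, every line is assumed to carry all ten columns (with blanks for omitted values), and the numeric formats are assumed to mean "up to 9 digits" and "up to 8 digits with up to 2 decimals". It only insists that trans_id and item_id are non-blank; trans_date and trans_volume are checked for format when present.

```python
import re
from datetime import datetime

# Column order as listed in the column definitions.
FIELDS = [
    "trans_id", "item_id", "trans_date", "trans_volume", "trans_total_price",
    "item_vol", "item_price", "item_name", "item_attr_1", "item_attr_2",
]

# Text columns limited to 32 characters.
TEXT_FIELDS = ["trans_id", "item_id", "item_name", "item_attr_1", "item_attr_2"]

# Assumed reading of the numeric formats:
#   999999999   -> 1 to 9 digits, no sign
#   99999999.99 -> 1 to 8 digits, optionally followed by 1 or 2 decimals
VOLUME_RE = re.compile(r"\d{1,9}")
PRICE_RE = re.compile(r"\d{1,8}(\.\d{1,2})?")

# Fields this sketch refuses to accept blank (an assumption). Add
# "trans_date" and "trans_volume" if your feed must always populate them.
NON_BLANK = ["trans_id", "item_id"]


def is_valid_date(text):
    """Return True for dates written as mm/dd/yy or mm/dd/yyyy."""
    for fmt in ("%m/%d/%y", "%m/%d/%Y"):
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def validate_row(line):
    """Return a list of problems found in one delimited line (empty if none)."""
    values = line.split(",")
    if len(values) != len(FIELDS):
        return [f"expected {len(FIELDS)} columns, found {len(values)}"]
    row = dict(zip(FIELDS, values))
    problems = []
    for name in NON_BLANK:
        if not row[name]:
            problems.append(f"{name} is blank")
    for name in TEXT_FIELDS:
        if len(row[name]) > 32:
            problems.append(f"{name} is longer than 32 characters")
    if row["trans_date"] and not is_valid_date(row["trans_date"]):
        problems.append("trans_date is not mm/dd/yy or mm/dd/yyyy")
    for name in ("trans_volume", "item_vol"):
        if row[name] and not VOLUME_RE.fullmatch(row[name]):
            problems.append(f"{name} is not an unsigned whole number")
    for name in ("trans_total_price", "item_price"):
        if row[name] and not PRICE_RE.fullmatch(row[name]):
            problems.append(f"{name} is not an unsigned price")
    return problems
```

Calling `validate_row` on each line before sending a file gives a list of human-readable problems per line; an empty list means the line passed these particular checks.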

Example Data:


This table shows 2 transactions (10001 & 10002), the first having 3 items purchased and the second having 2 items purchased. You can also see that some fields are left blank because they aren’t required.

The first 2 rows of this data would look like this in the delimited file:

10001,Abc23,10/2/09,1,239.65,1,,Wood Rake,Lawn Care,ACME Inc.
10001,J2456,10/2/09,1,239.65,4,,60W Light Bulbs,Lighting,LightCo
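Because all lines for a trans_id must sit together in the file, it can be useful to check the grouping before sending data. The Python sketch below uses the two rows above as input; the function name `find_split_transactions` is hypothetical, and the approach (comparing runs of consecutive lines) is just one way to do the check.

```python
from itertools import groupby

lines = [
    "10001,Abc23,10/2/09,1,239.65,1,,Wood Rake,Lawn Care,ACME Inc.",
    "10001,J2456,10/2/09,1,239.65,4,,60W Light Bulbs,Lighting,LightCo",
]


def find_split_transactions(lines):
    """Return trans_id values that appear in more than one separate run of lines."""
    seen = set()
    split_ids = []
    # groupby yields one group per run of consecutive lines sharing a trans_id.
    for trans_id, _run in groupby(lines, key=lambda line: line.split(",")[0]):
        if trans_id in seen and trans_id not in split_ids:
            split_ids.append(trans_id)
        seen.add(trans_id)
    return split_ids


print(find_split_transactions(lines))  # prints [] because both lines are adjacent
```

Any trans_id returned by the function has lines spread through the file and would need to be regrouped, for example by sorting the file on the first column.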

There is no limit to the number of transactions, but typically using anywhere from 3 months to 2 years of sales data gives the best results for detecting buying patterns. Emcien can suggest the best amount of data for your product line and market.

Acceptable Data Values

The Emcien data file accepts values using any combination of the following character types.

  • Alphabetic characters – (a-z, A-Z)
  • Numeric characters – (0-9)
  • Special characters – Everything EXCEPT the comma “,” character is available to use within strings. The comma character is reserved for delimiting columns only.
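
When a file is generated by a script rather than a spreadsheet, the comma rule is easy to enforce at write time. The Python sketch below is a minimal illustration; `format_line` is a hypothetical helper, and raising an error is only one possible policy (stripping or replacing the comma in the source data would be another).

```python
def format_line(values):
    """Join column values into one delimited line, rejecting embedded commas."""
    for position, value in enumerate(values, start=1):
        if "," in value:
            raise ValueError(f"column {position} contains a comma: {value!r}")
    return ",".join(values)


row = ["10001", "Abc23", "10/2/09", "1", "239.65", "1", "",
       "Wood Rake", "Lawn Care", "ACME Inc."]
print(format_line(row))
```

A value such as `Rake, Wood` would raise a `ValueError` here instead of silently shifting every later column one position to the right.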