Additional Detail for Manual Automation

Programmatically controlling EmcienPatterns allows you to build a closed-loop, end-to-end system that analyzes data as it changes, extracts the predictive rules within that data, and then makes predictions on new incoming data using those rules.

This allows your organization to implement a predictive process that keeps up with changing data without having to manually update models or review results.  Feeding internal processes, workflows, and dashboards with up-to-date predictions of outcomes enables a forward-looking strategy built on leading indicators rather than lagging ones.

API Overview

To get started, let's first understand the key components of the EmcienPatterns automation:

  • Analysis / Creating Predictive Rules 
  • Prediction Making

NOTE:  To use automation you will need to be familiar with cURL, FTP/SFTP, HTTP GET & POST requests, and JSON.  These automation features are available on OS X and Ubuntu.

  1. Uploading data
    • To analyze a data file, it must be loaded into the SFTP directory for your user on EmcienPatterns (for more information about the SFTP directory, see our article on Uploading Data).  This is easily scripted with a utility like LFTP, which can run noninteractively; the standard sftp command is interactive, making it difficult to script.
      • The command would be in the following format:
      • lftp -c "open -u <ftp username>,<ftp password> <ftp site address>; put <your file location.csv>"
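      • For example, a minimal sketch with hypothetical credentials, host, and file path (substitute your own values):
      • lftp -c "open -u jdoe,s3cret sftp://patterns.example.com; put data/network_events.csv"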
  2. Analyzing data
    • To programmatically start an analysis job, you will use an HTTP POST request via cURL that will look something like this:
      • curl --insecure -qLSfs -w "\n%{http_code}" -d "<$DATA>" -D - "<$URL>" | grep "reports/[0-9]*/log" | awk -F"/" '{ print $8 }'
      • The $URL is $BASE_URL/admin/companies/$COMPANY_ID/reports.json
      • The $DATA is:

      • auth_token=$Your_Auth_Token&report[original_file_name]=$Your_Filename&report[name]=$Name_For_This_Report&outcome=$Outcome_Column&projects[]=$Project_ID
        • The last “&” and everything that follows can be left out if you want to use the default project
        • $Your_Filename is the name of the data file to create the analysis report from.  This was the file ftp’d in step 1.
        • $Name_For_This_Report is the desired name for the report
           
    • A full example call might look like this (a scripted version follows the example):
      • curl --insecure -qLSfs -w "\n%{http_code}" -d "auth_token=0y123eo6uxfghvk3gj2numxyzq&report[original_file_name]=banded_network_attack.wide.csv&report[name]=report_banded_network_attack.wide.csv&outcome=PRIORITY&projects[]=5" -D - "https://analyze.example.com/admin/companies/4/reports.json?auth_token=0y123eo6uxfghvk3gj2numxyzq" -o /dev/null | grep "reports/[0-9]*/log" | awk -F"/" '{ print $8 }'
        • The piping should return the report ID as a number grep’d from a string like the following:
          • https://analyze.example.com/admin/companies/4/reports/38776895/log
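    • Putting the upload and job-creation steps together, a minimal shell sketch might look like the following.  All values here are placeholders for your own environment, and the grep/awk pipe is the same report-ID extraction shown above:

        BASE_URL="https://analyze.example.com"      # your EmcienPatterns server (placeholder)
        AUTH_TOKEN="your_auth_token_here"           # placeholder
        COMPANY_ID=4                                # placeholder
        DATA="auth_token=$AUTH_TOKEN&report[original_file_name]=mydata.csv&report[name]=mydata_report&outcome=PRIORITY"

        # POST the request; the report ID is the 8th "/"-delimited field of the
        # reports/<id>/log URL that appears in the response
        REPORT_ID=$(curl --insecure -qLSfs -d "$DATA" -D - \
          "$BASE_URL/admin/companies/$COMPANY_ID/reports.json" -o /dev/null \
          | grep "reports/[0-9]*/log" | awk -F"/" '{ print $8 }')
        echo "Started report $REPORT_ID"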
  3. Analysis job status
    • To monitor the progress of a report, your program/script will need to call an API method that checks the status of an analysis job to see whether it has completed, failed, or is still in progress.
    • You will use an HTTP GET request via curl that looks something like this:
      • curl --insecure -qLSfs -w "\n%{http_code}" "<$URL>?auth_token=<$AUTH_TOKEN>" | jq '.report?.state'
      • The $URL will be $BASE_URL/admin/companies/$COMPANY_ID/reports/$REPORT_ID/log.json
        • Your $REPORT_ID is the report ID returned in step 2, when you started the job.
      • This request will respond with a JSON object containing a nested JSON object under the “report” key. This nested JSON object contains a key called “state” which will be “report-ready” if the report is finished or “failed” or “invalid” if the report did not succeed.
      • You can extract this value using jq, as shown in the status request above.
    • An example of this would be:
      • curl --insecure -qLSfs -w "\n%{http_code}" "https://analyze.example.com/admin/companies/4/reports/38776895/log.json?auth_token=0y123eo6uxfghvksdre3gj2numxyzq" | jq '.report?.state'
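    • As a sketch, a script can poll this endpoint until the report reaches one of the terminal states described above.  The 30-second interval is an arbitrary choice, and $BASE_URL, $COMPANY_ID, $AUTH_TOKEN, and $REPORT_ID are carried over from the sketch in step 2:

        # Poll the job log until the analysis succeeds or fails
        while true; do
          STATE=$(curl --insecure -qLSfs \
            "$BASE_URL/admin/companies/$COMPANY_ID/reports/$REPORT_ID/log.json?auth_token=$AUTH_TOKEN" \
            | jq -r '.report?.state')
          echo "Report state: $STATE"
          case "$STATE" in
            report-ready) break ;;                                  # finished successfully
            failed|invalid) echo "Analysis failed" >&2; exit 1 ;;   # terminal failure states
          esac
          sleep 30                                                  # wait before checking again
        done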

Making predictions involves applying the rules extracted in the Analysis step above to new data.  This involves just a few simple steps:

  1. Upload the test data
    See the upload technique described in the Rules Extraction process above.
  2. Create a prediction job
    Applies the pre-computed rules from the Rules Extraction phase to a new data set and makes predictions on each transaction.
    • Use a “POST” request via curl with the following to create a prediction job:
      • $URL:  $BASE_URL/predict/shards.json
    • An example call would look something like this:
      • curl --insecure -qLSfs -w "\n%{http_code}" -d "auth_token=zohlpe7n4z8e6hnak2jehzkc&shard[params][filename]=laptop_failure_diagnostic_test.wide.csv&shard[params][rules_report_id]=11105529&shard[params][encoding]=UTF-8&shard[params][delimiter]=comma&predict_shard[name]=command_line_test" -D - "https://analyze.example.com/predict/shards.json" | awk -F"," '{ print $1 }' | cut -d":" -f2
        • Parameters:
          • shard[params][filename]
            – the file containing the new test transactions, uploaded via FTP
          • shard[params][rules_report_id]
            – the ID of the report containing the rules previously computed using the report method above
          • shard[params][encoding]
            – typically 'UTF-8', but can be set to others as needed (see the 'Override Results' note below)
          • shard[params][delimiter]
            – typically 'comma', but can be set to others as needed (see the 'Override Results' note below)
          • predict_shard[name]
            – the name you wish to give to this prediction job
        • Note: 'Override Results' – you can see the possible values for the above parameters in the 'Override Results' area of the analyze screen.  Anything that can be set in the UI can also be passed in via this API method.
    • The above example launches a prediction for report ID 11105529, which was previously computed using the report command, and tests a file of new data called “laptop_failure_diagnostic_test.wide.csv” previously ftp’d up to the server via the command line.  The response from this curl command includes the ID of the prediction job, which is used in the commands below to get its status and results.
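    • A minimal sketch of this step, reusing the variables from the sketches above (the file and job names are placeholders, and the awk/cut pipe assumes the response body begins with the "id" field, as in the example above):

        # Create a prediction job and capture its ID from the JSON response
        SHARD_ID=$(curl --insecure -qLSfs -d \
          "auth_token=$AUTH_TOKEN&shard[params][filename]=new_data.csv&shard[params][rules_report_id]=$REPORT_ID&shard[params][encoding]=UTF-8&shard[params][delimiter]=comma&predict_shard[name]=nightly_predictions" \
          "$BASE_URL/predict/shards.json" \
          | awk -F"," '{ print $1 }' | cut -d":" -f2)
        echo "Prediction job $SHARD_ID created"

    • The -D - header dump from the original example is omitted in this sketch so that only the JSON body reaches the pipe.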
       
  3. Download results of prediction job
    Programmatically pull down the overview of the predictions, the transaction-level predictions, and the detailed “reasons” (matching rules) for each prediction, for use in a downstream process.
    • Example HTTP GET request:
      • curl "https://analyze.example.com/predict/shards/<ID#>/download/predictions.csv.gz?auth_token=<$AUTH_TOKEN>"  -o <output filename.csv.gz>
    • Replace <ID#> in the example above with the ID returned when the prediction job was created. When the job is complete (status 200), the request returns the data for the API endpoint you have chosen.
    • There are four endpoints (a download sketch follows the list):
      1. Prediction Overview:
        • https://analyze.example.com/predict/shards/<ID#>/download/prediction_overview.csv.gz?auth_token=<$AUTH_TOKEN>
      2. Predictions & Top Reasons:
        • https://analyze.example.com/predict/shards/<ID#>/download/predictions.csv.gz?auth_token=<$AUTH_TOKEN>
      3. All Reasons:
        • https://analyze.example.com/predict/shards/<ID#>/download/reasons.csv.gz?auth_token=<$AUTH_TOKEN>
      4. Rules:
        • https://analyze.example.com/predict/shards/<ID#>/download/rules.csv.gz?auth_token=<$AUTH_TOKEN>
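    • As a final sketch, all four outputs can be fetched in one loop and decompressed.  The output filenames are arbitrary, and $SHARD_ID is the prediction job ID captured above:

        # Download and decompress each prediction output
        for ENDPOINT in prediction_overview predictions reasons rules; do
          curl --insecure -qLSfs \
            "$BASE_URL/predict/shards/$SHARD_ID/download/$ENDPOINT.csv.gz?auth_token=$AUTH_TOKEN" \
            -o "$ENDPOINT.csv.gz"
          gunzip -f "$ENDPOINT.csv.gz"                # yields <endpoint>.csv
        done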