.. docs/operations.rst

******************
Operations Section
******************

The ``Operations`` section is a list of operations (transformations) to be
performed in order. Each operation should be added as a dictionary to the
``Operations`` list in the order it is to be performed. The name of the
operation is provided as the key, and its value is a nested dictionary of
key-value pairs for ``fields``, ``args``, and ``kwargs``.

What needs to be provided as a value for each key depends on the operation
being performed, but generally:

* ``fields`` is a list of fields the operation will be applied to.

  * If the string ``"*"`` is supplied instead of a list, the operation is
    applied to all currently existing fields.
  * If any other string is provided, the string is converted to a list with
    only that string as an entry.

* ``args`` are the arguments to be supplied to the operation.
* ``kwargs`` are keyword arguments the operation may use.

The ``Operations`` section is optional.

Available Operations
====================

cast
----

The ``cast`` operation casts fields into a different type.

Available Types for Casting
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Type values for casting, and accepted aliases:

* int: "int", "integer", "long"
* float: "float", "double"
* Decimal: "decimal" (the Python type :py:class:`decimal.Decimal`)
* bool: "bool", "boolean"
* str: "str", "string"

Type Cast Examples
^^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"cast": {"fields": "Total", "args": ["int"]}}
    ]

change place
------------

The ``change place`` operation changes an in-place operation into an
out-of-place operation. It can only be used with supported operations;
unsupported operations simply return the row unchanged.

Supported operations:

* hashmap
* slice

change place Examples
^^^^^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"change-place": {"fields": "Patient Country",
                          "args": ["Country Code", "hashmap", {
                              "United States of America": "USA",
                              "United States": "USA",
                              "US": "USA",
                              "Canada": "CAN",
                              "Antarctica (the territory South of 60 deg S)": "Antarctica"}]}}
    ]

create field
------------

The ``create field`` operation adds a new field to the source dictionary. By
default, new fields are empty strings. If ``args`` is supplied, the new field
takes the value supplied when there is only one argument, or the whole list of
arguments when there is more than one.

Create Field Example
^^^^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"create": {"fields": "Total"}}
    ]

duplicate
---------

The ``duplicate`` operation makes a duplicate of the field supplied to
``fields``. Only one field should be supplied to the ``fields`` key. A new
field with matching values is created for every name supplied to ``args``.
Raises a :py:exc:`ValueError` if any field named in ``args`` already exists.

Duplicate Examples
^^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"duplicate": {"fields": "Phone 1", "args": ["temp_phone"]}}
    ]

Filter Operations
-----------------

There are two operations to filter data. To use a filter on fields, supply two
values to the ``args`` list. The first value should be the conditional to test,
and the second should be the value to test the field against.

exclude
    Removes rows where the value of a given field meets a conditional.

include
    Keeps only rows where the value of a given field meets a conditional.


Filter Conditionals
^^^^^^^^^^^^^^^^^^^

Filter comparison values and accepted aliases:

* equal: "equal", "=", "==", "isequal", "is equal"
* not equal: "not equal", "notequal", "!=", "~=", "<>", "not equals to", "not ="
* in: "in", "contains", "substring"
* not in: "not in", "notin"
* greater than: "gt", "greaterthan", "greater than", ">"
* less than: "lt", "lessthan", "less than", "<"
* greater than or equal: "ge", "greater or equal", "greater than or equal", ">=", "≥"
* less than or equal: "le", "less or equal", "less than or equal", "<=", "≤"
* boolean: "bool", "boolean", "truthiness", "truthy", "falsy"
* none: "isnone", "none"

Filter Examples
^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"include": {"fields": "Last Name", "args": ["equal", "SMITH"]}}
    ]

.. code-block:: json-object

    "Operations": [
        {"exclude": {"fields": "Total", "args": ["<", 100.00]}}
    ]

hashmap
-------

The ``hashmap`` operation takes a single dictionary argument in the ``args``
list, and replaces each key found in the source with the matching value.

Hashmap Examples
^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"hashmap": {"fields": "Country", "args": [{
            "United States of America": "USA",
            "United States": "USA",
            "US": "USA",
            "Canada": "CAN",
            "Antarctica (the territory South of 60 deg S)": "Antarctica"}]
            }
        }
    ]

list
----

Makes a list at the field given in ``fields`` from the values of all fields
supplied as ``args``. If that field is also included in ``args``, its original
value will be lost.

Only one field should typically be supplied to ``fields``. If multiple
``fields`` are supplied, the same list is assigned to each of them.

All fields should be cast into the correct type for their planned usage before
the list is made.

list Examples
^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"list": {"fields": ["Full Name"], "args": ["First Name", "Last Name"]}}
    ]

slice
-----

To enable Python-like slices (such as ``"String to slice"[0:-3]``), Detakon
implements the ``slice`` operation using the Python :py:class:`slice` class to
generate a slice object.

The first value supplied in the ``args`` list should be the start of the slice,
and the second should be the stop. Values should be integers. If a non-integer
value is provided, it is assumed to be ``None``. Indexing begins at 0, and
negative values can be used.

Slice Examples
^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"slice": {"fields": ["Phone Number"], "args": [0, -3]}}
    ]

Python String Methods
---------------------

Any Python :py:class:`str` method should be accessible as an operation. The
values supplied to the ``args`` list should be in the same order as the
arguments of the corresponding method. Keyword arguments can be supplied using
a nested dictionary with the key ``kwargs``. If the method does not require any
``args`` or ``kwargs``, these items can be omitted.

String Method Examples
^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"split": {"fields": "Phone 1", "kwargs": {"sep": "-"}}},
        {"upper": {"fields": ["Email", "Last Name", "First Name"]}}
    ]

join
----

The ``join`` operation calls the string method :py:meth:`str.join` to join a
list of strings. Because of how ``str.join`` works (it is called on the
separator rather than on the list), this operation was added to make it work
correctly with Detamaps. The ``fields`` values name the fields that contain
lists to join, and ``args[0]`` should contain the separator to join on.

Certain operations (such as ``split``) may result in lists, which can later be
joined with this operation.

join Examples
^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"join": {"fields": "filepath", "args": ["_"]}}
    ]

Math Operations
===============

Math operations use the ``fields`` values as destinations of the operations.
Math operations are typically intended to have one value passed to ``fields``.
The operation is performed sequentially on each field passed to ``args``, with
the final output assigned to the destination specified in ``fields``. If a
field in ``args`` contains a list, the operation is performed sequentially on
each element of the list before continuing to the next field.

sum
---

Performs addition on each field passed to ``args`` and returns the final sum to
the specified field. All lists will be flattened and summed.

By default, summation starts at 0. To set a start value, either:

* Pass a field name for the start value to ``kwargs`` with the key
  ``"start_field"``
* Pass a number as the start value to ``kwargs`` with the key
  ``"start_value"``

sum Examples
^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"sum": {"fields": ["Total"], "args": ["Subtotal", "Tax", "Shipping"]}}
    ]

subtract
--------

Performs subtraction on each field passed to ``args`` and returns the final
difference to the specified field. All lists will be flattened and subtracted.

By default, subtraction starts at 0. To set a start value, either:

* Pass a field name for the start value to ``kwargs`` with the key
  ``"start_field"``
* Pass a number as the start value to ``kwargs`` with the key
  ``"start_value"``

subtract Examples
^^^^^^^^^^^^^^^^^

.. code-block:: json-object

    "Operations": [
        {"subtract": {"fields": ["Total"], "args": ["Coupon"], "kwargs": {"start_field": "Subtotal"}}}
    ]

Complete Operations Section Examples
====================================

.. code-block:: json-object

    "Operations": [
        {"hashmap": {"fields": "Country", "args": [{
            "United States of America": "USA",
            "United States": "USA",
            "US": "USA",
            "Canada": "CAN",
            "Antarctica (the territory South of 60 deg S)": "Antarctica"}]
            }
        },
        {"duplicate": {"fields": "Phone 1", "args": ["temp_phone"]}},
        {"split": {"fields": "Phone 1", "kwargs": {"sep": "-"}}},
        {"upper": {"fields": ["Email", "Last Name", "First Name"]}}
    ]
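
The fragment below is a further sketch rather than a tested configuration. It
chains only operations documented above (``cast``, ``sum``, and ``exclude``) to
show why order matters: the inputs are cast before they are summed, and rows
are filtered on the computed total afterwards. The field names are reused from
the earlier examples, and it is an assumption that ``sum`` can write to a
``Total`` field that was not created beforehand.

.. code-block:: json-object

    "Operations": [
        {"cast": {"fields": ["Subtotal", "Tax", "Shipping"], "args": ["float"]}},
        {"sum": {"fields": ["Total"], "args": ["Subtotal", "Tax", "Shipping"]}},
        {"exclude": {"fields": "Total", "args": ["<", 100.00]}}
    ]

Swapping the last two entries would presumably test ``Total`` before it had
been computed, which is why each operation is listed in the order it should
run.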