Python Commands
log-store has the ability to run custom Python commands, which can modify and filter logs. These commands work just like any of the built-in commands; however, they are written in Python and can do almost anything Python code can do.
Basic Structure
log-store can only call Python code that follows the structure it expects. The basic format of the code is as follows:
from typing import Optional

def process(log: dict) -> Optional[dict]:
    return log
A few things to note about this structure:
* The function to be called by log-store MUST be named `process`, take a single `dict` argument, and return either `None` or a `dict`.
* The timestamp field, in the format passed to the `process` function, should be included if the log is returned. The timestamp’s value may be modified; however, the format (ms since epoch) must remain the same or other commands might not work.
* You should always check that a field you are going to modify is included in the log before accessing it. This is typically done using something like `if 'field_name' in log:`. Failure to do so could result in a Python `KeyError`.
* Attempting to read from STDIN will cause the function to hang, and could require log-store to be restarted. Writing to
STDOUT or STDERR will result in logs being printed to the console (or whatever is capturing those outputs).
* There is no way to accurately aggregate logs, because your function is only called once per log until the limit is satisfied. A future release of log-store will allow for aggregation via Python.
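The points above can be sketched as a small command. This is a minimal illustration, not a definitive implementation: the field name `level` is hypothetical and chosen only for the example.

```python
from typing import Optional


def process(log: dict) -> Optional[dict]:
    # 'level' is a hypothetical field name used only for illustration.
    # Check that the field exists before touching it to avoid a KeyError.
    if 'level' in log and isinstance(log['level'], str):
        log['level'] = log['level'].upper()
    # Every other field, including the timestamp, is returned unchanged.
    return log
```

A log without a `level` field passes through untouched, so the command is safe to run against mixed data.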
Debugging Commands
For basic commands, you can easily debug them by using the Test box at the bottom of the “Python Commands” interface. For more complex commands, it is recommended that you develop and debug them in a separate IDE. This can easily be done by using the following code snippet:
import sys
import json
from typing import Optional

def process(log: dict) -> Optional[dict]:
    # Write your code here
    return log

if __name__ == '__main__':
    # Read one JSON log per line from STDIN and print each log that is not filtered
    for line in sys.stdin:
        in_log = json.loads(line.strip())
        ret_log = process(in_log)
        if ret_log is not None:
            print(json.dumps(ret_log))
Logs can then be sent to the script, one per line, and the result of your command will be printed. You can capture sample logs from log-store by using the `json` command, and simply copy-and-pasting them into a test file.
Continually testing your function against a test file is helpful as you will get the same logs for each run.
Once your command is working, simply copy-and-paste it back into the log-store “Python Commands” interface. It is a good idea to test your function once more on fresh data.
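If you want repeatable checks rather than reading printed output by eye, one option is to wrap the loop in a helper that returns the surviving logs. The sketch below assumes a hypothetical helper named `run_samples`; it is not part of log-store.

```python
import json
from typing import Iterable, List, Optional


def process(log: dict) -> Optional[dict]:
    # Placeholder for the command under test.
    return log


def run_samples(lines: Iterable[str]) -> List[dict]:
    """Feed one JSON log per line through process() and collect the survivors."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the test file
        out = process(json.loads(line))
        if out is not None:
            results.append(out)
    return results
```

Because an open file iterates line by line, the same helper can be passed a file object for your test file, and its return value can be compared against the output you expect.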
Filtering Logs
You can easily filter logs with a Python command by simply returning `None` instead of the log. This indicates to log-store that you want to filter this log from the results. If you are already using a Python command, it is typically faster to filter the log in Python than to combine your Python command with the `where` command.
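As a minimal sketch of a filtering command, assuming a hypothetical `level` field whose value `DEBUG` marks logs you do not want:

```python
from typing import Optional


def process(log: dict) -> Optional[dict]:
    # 'level' and the value 'DEBUG' are hypothetical; adjust to your own logs.
    # dict.get() returns None when the field is missing, so no KeyError is raised.
    if log.get('level') == 'DEBUG':
        return None  # returning None filters this log out of the results
    return log
```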
log-store will continue to call your function with new logs until the overall output of the search query matches the requested limit. For example, if your function filters every other log, and the standard limit of 50 logs was used, your function would be called ~100 times.
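The arithmetic can be checked with a small simulation. The `run_until_limit` loop below is only a stand-in for log-store's search loop (an assumption, not its real code), and the `n` field is hypothetical.

```python
from itertools import count
from typing import Optional

calls = 0  # number of times process() has been invoked


def process(log: dict) -> Optional[dict]:
    global calls
    calls += 1
    # Hypothetical filter: keep logs with an even 'n', drop the rest.
    return log if log['n'] % 2 == 0 else None


def run_until_limit(limit: int) -> list:
    # Stand-in for the caller: keep feeding logs until enough survive.
    results = []
    for n in count():
        out = process({'n': n})
        if out is not None:
            results.append(out)
        if len(results) == limit:
            break
    return results


results = run_until_limit(50)
print(len(results), calls)  # prints: 50 99
```

Filtering every other log roughly doubles the number of calls, so an aggressive filter can make a query noticeably more expensive than its limit suggests.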
Importing Modules
All of the standard built-in Python 3.10 modules can be used by your command. Some of the more common and useful modules are:
* `datetime` - For manipulating dates
* `json` - For parsing or generating JSON; see the `lift` command
* `re` - Regular expressions; see the `extract` command
* `string` - Common string operations
Importing external or custom Python modules is not yet supported by log-store.
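As an illustration of using a standard module, the sketch below adds a human-readable copy of the timestamp with `datetime`. The field names `t` and `time_iso` are assumptions made for the example; check your own logs for the actual timestamp field name.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical field names; replace with the ones present in your logs.
TS_FIELD = 't'
ISO_FIELD = 'time_iso'


def process(log: dict) -> Optional[dict]:
    if TS_FIELD in log:
        # Convert ms since epoch to seconds, then to an aware UTC datetime.
        ts = datetime.fromtimestamp(log[TS_FIELD] / 1000, tz=timezone.utc)
        # Store the readable form in a new field; the original value is untouched.
        log[ISO_FIELD] = ts.isoformat()
    return log
```

Writing to a separate field keeps the original timestamp in its ms-since-epoch format, so other commands continue to work.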
Performance
All of log-store’s built-in commands are written in Rust and compiled into log-store, so calling them is very efficient. Using a Python command is less efficient because the Python code has to be interpreted and then run against the logs. You should therefore expect Python commands to be slower than their built-in counterparts. For example, extracting data via regular expressions, parsing JSON, splitting strings, and renaming fields should all be done with built-in commands. However, if you do need Python and have multiple actions to perform, they should all be done in a single Python command. As always, test your code against the built-in commands, and pick whichever is fastest for your use case.
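One reading of the "single Python command" advice is sketched below: the filter, the rename, and the split all happen in one `process` call. All field names (`path`, `msg`, `user`) are hypothetical.

```python
from typing import Optional


def process(log: dict) -> Optional[dict]:
    # All field names below are hypothetical.
    # 1. Filter first, so no further work is done on logs that will be dropped.
    if log.get('path') == '/health':
        return None
    # 2. Rename a field.
    if 'msg' in log:
        log['message'] = log.pop('msg')
    # 3. Split a string field into two new fields.
    if isinstance(log.get('user'), str) and '@' in log['user']:
        name, domain = log['user'].split('@', 1)
        log['user_name'] = name
        log['user_domain'] = domain
    return log
```

Whether this beats a chain of built-in commands depends on your data, so it is worth timing both approaches.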