Sumo Logic Cheat Sheet by TME520

A quick reference for Sumo Logic to stick on your desk and a good way to learn the basics about the product.

 Metadata

_sourceHost
The host name of the Source. For local Sources, the name of the Source is set when you configure the Source. For remote Collectors, this field uses the remote host's name. The _sourceHost metadata field is populated using a reverse DNS lookup. If the name cannot be resolved, _sourceHost is displayed as localhost.
_sourceHost=*MySQL*
_sourceName
The name of the log file, determined by the path you entered when you configured the Source.
_sourceName=/path/to/file/
_sourceName=*path*
_sourcename = "/var/log/tomcat/logs/foobar.log"
_sourceCategory
This field is created when you enter text into the Source Category field at Source configuration time. Log categories can be somewhat complex, as many log files may belong to more than one logical category.
_sourceCategory=OS*
_sourceCategory=*Application*
_sourceCategory=*Audit
_collector
Returns results from the named Collector only. Entered when a Collector is installed and activated.
_collector=public_cloud
_source
Returns results from the named Source only. Entered when a Source is configured.
_source=*syslog*
Note: While _sourcename = *api.log works, _sourcename = "*api.log" will fail.
List all categories: * | count by _sourceCategory | fields -_count

 Input format

keyvalue
For KVP type logs. The keyvalue operator allows you to get values from a log message by specifying the key paired with each value.
| keyvalue "age"
| keyvalue infer "hairColor", "lastVisit"
| keyvalue regex "=(.*?)[,|}]" keys "serviceinfo.IP", "loggingcontext.region", "request.method" as ip, region, method
| keyvalue auto
csv
The csv operator allows you to parse Comma Separated Values (CSV) formatted log entries. It uses a comma as the default delimiter.
 
Parse comma delimited fields
| csv _raw extract 1 as user, 2 as id, 3 as name

Parse a stream query and extract search terms
"Starting stream query" | parse "query=[*], queryId" as query | csv query extract searchTerms, op1, op2, op3

Specify an escape and a quote character
| csv fieldName escape='\', quote=''' extract A, B, _, _, E, F
JSON
The JSON operator allows you to extract values from JSON input. Because JSON supports both nested keys and arrays that contain ordered sequences of values, the Sumo Logic JSON operator allows you to extract single top-level fields, multiple fields, nested keys and keys in arrays.
 
Extracting a single top-level field
_sourceCategory=stream RawOutputProcessor "\"message\"" | parse "explainJsonPlan.stream]*" as jsonobject | json field=jsonobject "sessionId" | fields -jsonobject

Extracting multiple fields
_sourceCategory=stream RawOutputProcessor "\"message\"" | parse "explainJsonPlan.stream]*" as jsonobject | json field=jsonobject "sessionId", "customerId" | fields -jsonobject

Extracting a nested key
* | json field=jsonobject "meta.type"

Finding values in a JSON array
* | json field=jsonobject "baselineIntervals"

Refer to one specific entry in an array
* | json field=jsonobject "baselineIntervals[1]"

Using the nodrop option
* | json field=jsonobject "baselineIntervals[0]" nodrop

Note: The JSON operator also supports the nodrop option, which allows messages containing invalid JSON values to be displayed.

Using wildcard (*)
_sourceCategory=O365* | json "Actor[*].Type" as Actortype
 
json auto searches for JSON blobs starting at the end of the message. Logs usually begin with a preamble, such as a timestamp. If content appears after the JSON blob, or the blob sits in the middle of the message, extraction can fail, so keeping the JSON blob at the end of the message is recommended.
| json auto
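The path forms used in the JSON examples above (a nested key such as "meta.type", a whole array, and one indexed entry) can be illustrated with a small Python sketch. This is not how Sumo Logic implements the json operator; the payload is invented, the helper name `extract` is hypothetical, and zero-based array indexing is an assumption.

```python
import json
import re


def extract(blob: str, path: str):
    """Follow a dotted path such as 'meta.type' or 'baselineIntervals[1]'."""
    node = json.loads(blob)
    for part in path.split("."):
        # Each segment is a key, optionally followed by one [index].
        match = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        key, index = match.group(1), match.group(2)
        node = node[key]
        if index is not None:
            node = node[int(index)]  # assumption: indexes are zero-based
    return node


# Hypothetical payload, invented for illustration only.
sample = '{"meta": {"type": "anomaly"}, "baselineIntervals": ["a", "b", "c"]}'

print(extract(sample, "meta.type"))             # nested key
print(extract(sample, "baselineIntervals"))     # whole array
print(extract(sample, "baselineIntervals[1]"))  # one array entry
```

The wildcard form ("Actor[*].Type") is deliberately left out of this sketch.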
KVP: Key-Value Pairs. Logs formatted this way look something like this:
[2019-12-24 23:59:59.380 +1100] age=42 name="Rick Deckard" hairColor="brown" lastVisit="2018-04-19 13:00"
infer: Default mode. Uses an internal list of regular expressions to extract the value for a key.
regex: In Regular Expression mode, you must explicitly match keys and values based on a regex.
auto: Extract up to N fields. N is 100 by default.
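To make the KVP idea concrete, here is a minimal Python sketch that pulls named keys out of the sample log line above. It only mimics the outcome of a query such as keyvalue infer "hairColor", "lastVisit"; the function name `keyvalue` and its regular expression are assumptions for illustration, not Sumo Logic's internal rules.

```python
import re

LOG_LINE = (
    '[2019-12-24 23:59:59.380 +1100] age=42 name="Rick Deckard" '
    'hairColor="brown" lastVisit="2018-04-19 13:00"'
)


def keyvalue(message: str, *keys: str) -> dict:
    """Return the values found for the requested keys (missing keys are skipped)."""
    found = {}
    for key in keys:
        # key=value, where value is either "quoted text" or a bare token
        match = re.search(rf'\b{re.escape(key)}=(?:"([^"]*)"|(\S+))', message)
        if match:
            quoted, bare = match.group(1), match.group(2)
            found[key] = quoted if quoted is not None else bare
    return found


print(keyvalue(LOG_LINE, "age"))
print(keyvalue(LOG_LINE, "hairColor", "lastVisit"))
```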

 Conditions

if
There are two forms of ternary expression you can use in Sumo Logic queries: one is constructed using the IF operator, and the other uses the question mark (?) operator. These expressions are used to evaluate a condition as either true or false, with values assigned for each outcome. It is a shorthand way to express an if-else condition.
| if(status_code matches "5*", 1, 0) as server_error
| status_code matches "5*" ? 1 : 0 as server_error
in
The In operator returns a Boolean value: true if the specified property is in the specified object, or false if it is not.
| if (status_code in ("500", "501", "502", "503", "504", "505", "506", "401", "402", "403", "404"), "Error", "OK") as status_code_type
where
The where operator must appear as a separate operator distinct from other operators, delimited by the pipe symbol ("|").
//We recommend placing inclusive filters before exclusive filters in query strings
  
| where status_code matches "4*"
| where !(status_code matches "2*")
isBlank
The isBlank operator checks to see that a string contains text. Specifically, it checks to see if a character sequence is whitespace, empty (""), or null. It takes a single parameter and returns a Boolean value: true if the variable is indeed blank, or false if the variable contains a value other than whitespace, empty, or null.
| where isBlank(user)
isEmpty
The isEmpty operator checks to see that a string contains text. Specifically, it checks to see whether a character sequence is empty ("") or null. It takes a single parameter and returns a Boolean value: true if the variable is indeed empty, or false if the variable contains a value other than empty or null.
| if(isEmpty(src_ip), 1, 0) as null_ip_counts
isNull
The isNull operator takes a single parameter and returns a Boolean value: true if the variable is indeed null, or false if the variable contains a value other than null.
| where isNull(src_ip)
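The difference between isBlank, isEmpty and isNull is easy to mix up. The Python sketch below restates the three definitions given above, with None standing in for null; the function names are illustrative and this is not Sumo Logic code.

```python
from typing import Optional


def is_null(value: Optional[str]) -> bool:
    """True only for null."""
    return value is None


def is_empty(value: Optional[str]) -> bool:
    """True for null or the empty string."""
    return value is None or value == ""


def is_blank(value: Optional[str]) -> bool:
    """True for null, the empty string, or whitespace only."""
    return value is None or value.strip() == ""


# Print one row per candidate: value, is_null, is_empty, is_blank
for candidate in (None, "", "   ", "10.0.0.1"):
    print(repr(candidate), is_null(candidate), is_empty(candidate), is_blank(candidate))
```

Each check is a superset of the previous one: anything null is also empty, and anything empty is also blank.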

 Data extraction

parse(regex)
Best for variable patterns. Also called the extract operator; enables users to extract more complex data from log lines using regular expressions. Can be used to extract nested fields.
| parse "Content=*:" as content

Parsing an IP address
| parse regex "(?<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) "

Using a non-capturing group to express an OR condition
| parse regex "list 101 (?:accepted|denied) (?<protocol>.*?) "
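The two regular expressions above can be tried outside Sumo Logic. The Python sketch below assumes an invented firewall-style log line; note that Python spells named groups as (?P<name>...) whereas the queries above use (?<name>...).

```python
import re

# Same patterns as the queries above, in Python's named-group syntax.
IP_PATTERN = re.compile(r"(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")
ACL_PATTERN = re.compile(r"list 101 (?:accepted|denied) (?P<protocol>.*?) ")

# Hypothetical log line, invented for illustration only.
line = "list 101 denied tcp 192.168.1.10(1234) -> 10.0.0.5(80), 1 packet"

ip = IP_PATTERN.search(line)    # first IP address in the line
acl = ACL_PATTERN.search(line)  # protocol following accepted/denied

print(ip.group("ip_address"))
print(acl.group("protocol"))
```

The (?:...) group matches either alternative without creating a field, so only protocol is extracted.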
parse(anchor)
Best for predictable patterns. Also called parse anchor; parses strings according to specified start and stop anchors and labels them as fields for use in subsequent aggregation functions in the query, such as sorting or grouping.
| parse "User=*:" as user
split
The split operator allows you to split strings into multiple strings, and parse delimited log entries, such as space-delimited formats.
_sourceCategory=colon | parse "] * *" as log_level, text | split text delim=':' extract 1 as user, 2 as account_id, 3 as session_id, 4 as result
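As a rough illustration of what the split step in the query above does, the Python sketch below cuts a colon-delimited string and names the pieces by position. The input string and the helper `split_extract` are hypothetical, and 1-based positions are an assumption taken from the numbering in the example.

```python
def split_extract(text: str, delim: str, positions: dict) -> dict:
    """Split text on delim and name the requested 1-based positions."""
    parts = text.split(delim)
    return {
        name: parts[position - 1]
        for position, name in positions.items()
        if 1 <= position <= len(parts)  # silently skip positions that do not exist
    }


# Hypothetical value of the parsed "text" field.
text = "jdoe:1234:abcd:success"

fields = split_extract(
    text,
    ":",
    {1: "user", 2: "account_id", 3: "session_id", 4: "result"},
)
print(fields)
```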
fields
The fields operator allows you to choose which fields are displayed in the results of a query.
_sourceCategory=access_logs | parse using public/apache | fields method, status_code
limit
The limit operator reduces the number of raw messages or aggregate results returned.
| count by _sourceCategory | sort by _count | limit 5
matches
The matches operator can be used to match a string to a pattern.
| if (agent matches "*MSIE*", "Internet Explorer", "Other") as Browser
| if (agent matches "*Firefox*", "Firefox", Browser) as Browser
timeslice
The timeslice operator segregates data by time period.
| timeslice 1h | count by _timeslice
_sourcename=*tomcat* | timeslice by 5m | count by _timeslice
 
Output of last example:
#  Time                           _count
1  09/07/2017 11:25:00 AM +1000    9,234
2  09/07/2017 11:30:00 AM +1000   14,496
3  09/07/2017 11:35:00 AM +1000   15,988
4  09/07/2017 11:40:00 AM +1000    3,383
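The bucketing behind an output like the one above can be sketched in Python: each timestamp is floored to the start of its 5-minute slice, then the slices are counted. The event timestamps are invented and the helper name `timeslice` simply mirrors the operator; this is an illustration of the idea, not Sumo Logic's implementation.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def timeslice(timestamp: datetime, width: timedelta) -> datetime:
    """Floor a timestamp to the start of the bucket that contains it."""
    buckets_since_epoch = (timestamp - EPOCH) // width  # timedelta // timedelta yields an int
    return EPOCH + buckets_since_epoch * width


# Hypothetical event times, invented for illustration only.
events = [
    datetime(2017, 7, 9, 11, 26, 10),
    datetime(2017, 7, 9, 11, 29, 59),
    datetime(2017, 7, 9, 11, 30, 0),
]

# Equivalent in spirit to: timeslice 5m | count by _timeslice
counts = Counter(timeslice(event, timedelta(minutes=5)) for event in events)

for slice_start, count in sorted(counts.items()):
    print(slice_start, count)
```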
trace
A trace operator acts as a highly sophisticated filter to connect the dots across different log messages. You can use any identifying value with a trace operator (such as a user ID, IP address, session ID, etc.) to retrieve a comprehensive set of activity associated with that original ID.
| trace "ID=([0-9a-fA-F]{4})" "7F92"
About limit: Can be used in Dashboard Panels, but in the search it must be included after the first group-by phrase.
About timeslice: Timeslices greater than 1 day cannot be used in Dashboard Live mode.
About trace: Not supported in Live Dashboards or any continuous query.

 Crunch numbers

count
count_distinct
count_frequent
Used in conjunction with the group operator and a field name. Only the word by is required to represent the group operator. The count function is also an operator in its own right and therefore can be used with or without the word by.
| count by url
| count_distinct(referrer) by status_code
_sourcename=*tomcat* | count_distinct(_sourceName) group by _sourceHost | sort by _count_distinct desc
sum
Sum adds the values of the numerical field being evaluated within the time range analyzed.
| sum(bytes_received) group by _sourceHost
avg
The averaging function (avg) calculates the average value of the numerical field being evaluated within the time range analyzed.
| avg(request_received) by _timeslice
median
In order to calculate the median value for a particular field, you can utilize the Percentile (pct) operator with a percentile argument of 50.
| parse "value=*" as value | pct(value, 50) as median
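Why the 50th percentile serves as the median can be seen in a short Python sketch. It uses the simple nearest-rank definition of a percentile, which is an assumption made for illustration; Sumo Logic's pct operator may compute or approximate percentiles differently.

```python
import math


def pct(values, percentile):
    """Nearest-rank percentile: the smallest value with at least percentile% of the data at or below it."""
    if not values:
        raise ValueError("pct needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


values = [5, 1, 9, 3, 7]  # invented sample data
print(pct(values, 50))    # middle value of the sorted list
```

With an even number of values, nearest-rank returns the lower of the two middle values rather than their average.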
outlier
Given a series of time-stamped numerical values, using the outlier operator in a query can identify values in a sequence that seem unexpected, and would identify an alert or violation, for example, for a scheduled search.
_sourceCategory=IIS/Access | parse regex "\d+-\d+-\d+ \d+:\d+:\d+ (?<server_ip>\S+) (?<method>\S+) (?<cs_uri_stem>/\S+?) \S+ \d+ (?<user>\S+) (?<client_ip>[\.\d]+) " | parse regex "\d+ \d+ \d+ (?<response_time>\d+)$" | timeslice 1m | max(response_time) as response_time by _timeslice | outlier response_time window=5,threshold=3,consecutive=2,direction=+-
sort
The sort operator orders aggregated search results. The default sort order is descending.
| count as page_hits by _sourceHost | sort by page_hits asc
top
Use the top operator with the sort operator to reduce the number of sorted results returned.
| top 5 _sourcecategory
min
The minimum function returns the smaller of two values.
| min(1, 2) as v
// v = 1
max
The maximum function returns the larger of two values.
| max(1, 2) as v
// v = 2
About count_frequent: You can use the count_frequent operator in Dashboard queries, but the number of results returned is limited to the top 100 most frequent results.
About top: Can be used in Dashboard Panels, but in the search it must be included after the first group-by phrase.

Geo lookup

Sumo Logic can match an extracted IP address to its geographical location on a map. To create the map, after parsing the IP addresses from log files, the lookup operator matches extracted IP addresses to the physical location where the addresses originated.
| parse "remote_ip=*]" as remote_ip | lookup latitude, longitude, country_code, country_name, region, city, postal_code, area_code, metro_code from geo://default on ip = remote_ip | count by latitude, longitude, country_code, country_name, region, city, postal_code, area_code, metro_code | sort _count

logcompare

The logcompare operator allows you to compare two sets of logs: baseline (historical) and target (current). To run a LogCompare operation, you can use the LogCompare button on the Messages tab to generate a properly formatted query.
| logcompare timeshift -24h
About logcompare: Not supported in Dashboards.

logreduce

The LogReduce algorithm uses fuzzy logic to cluster messages together based on string and pattern similarity. Use the LogReduce button and operator to quickly assess activity patterns for things like a range of devices or traffic on a website.
| logreduce
About logreduce: Not supported in Dashboards.

save

Using the Save operator allows you to save the results of a query into the Sumo Logic file system. Later, you can use the lookup operator to access the saved data. The Save operator saves data in a simple format to a location you choose.
| save /shared/lookups/daily_users
About save: Not supported in Dashboards.

 Visualization

transpose
Turn a list into a table in the Aggregates tab.
transpose row [row fields] column [column fields]
  
_sourceCategory=Labs/Apache/Access | timeslice 5m | count by _timeslice, status_code | transpose row _timeslice column status_code
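What transpose does to the aggregate list can be sketched in Python: rows of (_timeslice, status_code, _count) are pivoted so that each timeslice becomes one row and each status code becomes a column. The input rows and the helper name `transpose` are invented for illustration and do not reflect Sumo Logic's implementation.

```python
from collections import defaultdict


def transpose(rows, row_field, column_field, value_field="_count"):
    """Pivot a flat list of aggregate rows into {row: {column: value}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row[row_field]][row[column_field]] = row[value_field]
    return dict(table)


# Hypothetical output of: timeslice 5m | count by _timeslice, status_code
rows = [
    {"_timeslice": "11:25", "status_code": "200", "_count": 120},
    {"_timeslice": "11:25", "status_code": "404", "_count": 3},
    {"_timeslice": "11:30", "status_code": "200", "_count": 98},
]

pivoted = transpose(rows, row_field="_timeslice", column_field="status_code")
for timeslice_label, columns in pivoted.items():
    print(timeslice_label, columns)
```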
