Support parsing a headerless jarray (expected behavior is similar to csv_parser) #30321

Closed
@RoeiDimi

Description

Component(s)

pkg/stanza

Is your feature request related to a problem? Please describe.

The csv_parser operator accepts a list of headers and parses an input CSV line into attributes. There is no equivalent support for jarray lines, i.e. a similar comma-separated line that is wrapped in brackets.

Describe the solution you'd like

Add an 'is_jarray' flag to csv_parser. With it, the operator config could look something like this:

operators:
    - type: csv_parser
      header: TimeGenerated,SourceIP,SourcePort,DestinationIP,DestinationPort,Protocol,SentBytes,ReceivedBytes,ExtID
      parse_from: body
      parse_to: attributes
      is_jarray: true

Then add a suitable parse-function generator (for example, generateJarrayParseFunc in the draft I created).

This allows maximal code reuse, since the expected behavior is basically the same as for a CSV line apart from the parsing itself: the operator receives headerless data, takes a predetermined list of headers from the config, and parses the values into attributes.
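
To make the intended behavior concrete, here is a minimal standalone sketch of the parsing step in Go. It is not the code from the draft and does not use any pkg/stanza APIs: the helper name parseJarrayLine, the sample line, and the choice to reject a line whose value count differs from the header count are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseJarrayLine is a hypothetical helper (not part of pkg/stanza).
// It decodes a headerless JSON array and pairs each value, by position,
// with the corresponding configured header name.
func parseJarrayLine(line string, headers []string) (map[string]any, error) {
	var values []any
	if err := json.Unmarshal([]byte(line), &values); err != nil {
		return nil, fmt.Errorf("input is not a JSON array: %w", err)
	}

	// Assumption: a count mismatch is treated as an error rather than
	// padding or truncating the values.
	if len(values) != len(headers) {
		return nil, fmt.Errorf("expected %d values, got %d", len(headers), len(values))
	}

	parsed := make(map[string]any, len(headers))
	for i, header := range headers {
		parsed[header] = values[i]
	}
	return parsed, nil
}

func main() {
	// A shortened header list and a made-up input line, for illustration only.
	headers := []string{"TimeGenerated", "SourceIP", "SourcePort"}
	line := `["2023-01-01T00:00:00Z", "10.0.0.1", 443]`

	attrs, err := parseJarrayLine(line, headers)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(attrs)
}
```

One difference from a CSV line is that values keep their JSON types: with encoding/json, a number such as 443 decodes to a float64 rather than a string, so the operator would need to decide whether to preserve or stringify such values.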

Describe alternatives you've considered

Implementing a separate jarray_parser operator, with the downside that most of csv_parser's code would be duplicated.

Additional context

This is part of a bigger project in which we use otel-collector as infrastructure and receive many types of data from a client. The client's data is always some form of JSON, and this use case is a subset in which the JSON is a simple headerless jarray, so we need a way to parse it in this manner.
