API

SegmentCollection

class pydifact.segmentcollection.AbstractSegmentsContainer(extra_header_elements: List[Union[str, List[str]]] = None, characters: Optional[Characters] = None)

Abstract base class for classes that contain a collection of segments.

AbstractSegmentsContainer is the superclass of several classes such as RawSegmentCollection and Interchange and contains methods common to them.

Implementation detail: Subclasses must set HEADER_TAG and FOOTER_TAG.

Parameters:
  • extra_header_elements – A list of elements to be appended at the end of the header segment (same format as Segment constructor elements).

  • characters – The set of control characters

segments

The segments that comprise the container. This does not include the envelope (that is, the header and footer) segments. To get the envelope segments, use get_header_segment() and get_footer_segment().

characters

The control characters (a Characters object).

add_segment(segment: Segment) AbstractSegmentsContainer

Append a segment to the collection.

Note: skips segments that are header or footer tags of this segment container type.

Parameters:

segment – The segment to add
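The skipping rule noted above can be pictured with a minimal standalone sketch. It does not import pydifact: `SketchContainer` and the `(tag, elements)` tuples are hypothetical stand-ins for the real container and Segment classes, and only the HEADER_TAG and FOOTER_TAG names are taken from the text.

```python
from typing import List, Tuple

# Hypothetical stand-in for a Segment: (tag, list of elements).
SketchSegment = Tuple[str, list]


class SketchContainer:
    """Standalone illustration only; not pydifact's implementation."""

    HEADER_TAG = "UNB"
    FOOTER_TAG = "UNZ"

    def __init__(self) -> None:
        self.segments: List[SketchSegment] = []

    def add_segment(self, segment: SketchSegment) -> "SketchContainer":
        # Envelope segments are skipped; everything else is appended.
        if segment[0] not in (self.HEADER_TAG, self.FOOTER_TAG):
            self.segments.append(segment)
        return self  # returning self allows chained calls


container = SketchContainer()
container.add_segment(("UNB", ["UNOC", "1"])).add_segment(("BGM", ["220"]))
tags = [tag for tag, _ in container.segments]
```

In this sketch only the BGM segment ends up in `segments`, because UNB is the header tag of the container type.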

add_segments(segments: Union[List[Segment], Iterable]) AbstractSegmentsContainer

Append a list of segments to the collection.

Passing a UNA segment means setting/overriding the control characters and setting the serializer to output the Service String Advice. If you wish to change the control characters from the default and not output the Service String Advice, change characters instead, without passing a UNA Segment.

Parameters:

segments (list or iterable of Segment objects) – The segments to add.

classmethod from_segments(segments: Union[List, Iterable], characters: Optional[Characters] = None) AbstractSegmentsContainer

Create an instance from a list of segments.

Parameters:
  • segments (list/iterable of Segment) – The segments of the EDI interchange.

  • characters – The set of control characters.

classmethod from_str(string: str, parser: Optional[Parser] = None, characters: Optional[Characters] = None) AbstractSegmentsContainer

Create an instance from a string.

Parameters:
  • string – The EDI content.

  • parser – A parser to convert the tokens to segments; defaults to Parser.

  • characters – The set of control characters.

get_footer_segment() Optional[Segment]

Return the footer segment or None if there is no footer.

get_header_segment() Optional[Segment]

Return the header segment or None if there is no header.

get_segment(name: str, predicate: Callable = None) Optional[Segment]

Get the first segment that matches the requested name.

Parameters:
  • name – The name of the segment to return.

  • predicate – Optional predicate that must match on the segments to return.

Returns:

The requested segment, or None if not found.

get_segments(name: str, predicate: Callable = None) list

Get all segments that match the requested name.

Parameters:
  • name – The name of the segments to return.

  • predicate – Optional callable that returns True if the given segment matches a condition.

Return type:

list of Segment objects.
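How the name and the optional predicate combine can be sketched as follows. This is a standalone illustration under stated assumptions, not pydifact code: segments are modelled as hypothetical `(tag, elements)` tuples, and the two functions take the segment list as an explicit argument instead of being methods.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a Segment: (tag, list of elements).
SketchSegment = Tuple[str, list]


def get_segments(
    segments: List[SketchSegment],
    name: str,
    predicate: Optional[Callable[[SketchSegment], bool]] = None,
) -> List[SketchSegment]:
    """Return every segment with the given tag that also satisfies predicate."""
    return [
        seg
        for seg in segments
        if seg[0] == name and (predicate is None or predicate(seg))
    ]


def get_segment(
    segments: List[SketchSegment],
    name: str,
    predicate: Optional[Callable[[SketchSegment], bool]] = None,
) -> Optional[SketchSegment]:
    """Return the first match, or None if nothing matches."""
    matches = get_segments(segments, name, predicate)
    return matches[0] if matches else None


data = [("NAD", ["BY", "buyer"]), ("NAD", ["SU", "supplier"]), ("LIN", ["1"])]
suppliers = get_segments(data, "NAD", lambda seg: seg[1][0] == "SU")
first_nad = get_segment(data, "NAD")
missing = get_segment(data, "UNS")
```

Here the predicate narrows the two NAD segments down to the supplier entry, and a tag that does not occur yields None.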

serialize(break_lines: bool = False) str

Return the string representation of the object.

Parameters:

break_lines – If True, inserts line break after each segment terminator.

split_by(start_segment_tag: str) Iterable

Split the segment collection by tag.

Assuming the collection contains tags ["A", "B", "A", "A", "B", "D"], split_by("A") would return [["A", "B"], ["A"], ["A", "B", "D"]]. Everything before the first start segment is ignored, so if no matching start segment is found at all, the returned result is empty.

Parameters:

start_segment_tag – the segment tag we want to use as separator

Returns:

Generator of segment collections. The start tag is included in each yielded collection.
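The splitting rule described above can be reproduced with a short standalone generator. This is a sketch that works on plain tag strings, not on pydifact segment collections; the function name `split_by_tag` is hypothetical.

```python
from typing import Iterator, List


def split_by_tag(tags: List[str], start_tag: str) -> Iterator[List[str]]:
    """Yield groups of tags, each beginning with start_tag."""
    current: List[str] = []
    started = False  # becomes True at the first start tag
    for tag in tags:
        if tag == start_tag:
            if started:
                yield current  # close the previous group
            current = [tag]
            started = True
        elif started:
            current.append(tag)
        # Tags before the first start tag are ignored.
    if started:
        yield current


groups = list(split_by_tag(["A", "B", "A", "A", "B", "D"], "A"))
no_match = list(split_by_tag(["B", "D"], "A"))
```

With the tags from the description, `groups` holds the three lists given there, and `no_match` is empty because no start tag occurs.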

validate()

Validate the object.

Raises an exception if the object is invalid.

class pydifact.segmentcollection.FileSourcableMixin

For backward compatibility

For v0.2 drop this class and move from_file() to Interchange class.

classmethod from_file(file: str, encoding: str = 'iso8859-1', parser: Optional[Parser] = None) FileSourcableMixin

Create an Interchange instance from a file.

Raises FileNotFoundError if the file is not found.

Parameters:
  • file – The full path to a file that contains an EDI message.

  • encoding – An optional string that specifies the encoding. Default is “iso8859-1”.

Return type:

FileSourcableMixin

class pydifact.segmentcollection.Interchange(sender: str, recipient: str, control_reference: str, syntax_identifier: Tuple[str, int], delimiters: Characters = ':+,? '', timestamp: datetime.datetime = None, *args, **kwargs)

An interchange (started by UNB segment, ended by UNZ segment)

Optional features of UNB are not yet supported.

Functional groups are not yet supported.

Messages are supported (see get_messages()), but are optional: interchange segments can be accessed without going through messages.

classmethod from_segments(segments: Union[list, Iterable], characters: Optional[Characters] = None) Interchange

Create an instance from a list of segments.

Parameters:
  • segments (list/iterable of Segment) – The segments of the EDI interchange.

  • characters – The set of control characters.

get_footer_segment() Segment

Return a (UNZ) footer segment with correct segment count and control reference.

The count is either the number of messages or, if used, the number of functional groups in the interchange (TODO).

get_header_segment() Segment

Return the header segment or None if there is no header.

get_messages() List[Message]

Parse a list of messages out of the internal segments.

Raises EDISyntaxError if constraints are not met (e.g. the UNH and UNT segments must both be correct).

TODO: parts of this here are better done in the validate() method
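The general idea of cutting messages out of a flat segment list can be sketched as below. This is a standalone illustration on plain tag strings, not pydifact's implementation. `SketchSyntaxError` is a hypothetical stand-in for EDISyntaxError, and the three checks shown are assumptions, since the text does not list the exact constraints.

```python
from typing import List, Optional


class SketchSyntaxError(Exception):
    """Hypothetical stand-in for EDISyntaxError."""


def group_messages(tags: List[str]) -> List[List[str]]:
    """Collect the tags found between each UNH and its closing UNT."""
    messages: List[List[str]] = []
    current: Optional[List[str]] = None  # None means "not inside a message"
    for tag in tags:
        if tag == "UNH":
            if current is not None:
                raise SketchSyntaxError("UNH found before the previous UNT")
            current = []
        elif tag == "UNT":
            if current is None:
                raise SketchSyntaxError("UNT found without a preceding UNH")
            messages.append(current)
            current = None
        elif current is not None:
            current.append(tag)
        # Tags outside any UNH/UNT pair are ignored in this sketch.
    if current is not None:
        raise SketchSyntaxError("UNH without a closing UNT")
    return messages


messages = group_messages(["UNH", "BGM", "DTM", "UNT", "UNH", "BGM", "UNT"])
```

Two messages are found in the example; an unbalanced input such as a UNH without a UNT raises the stand-in exception.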

validate()

Validate the object.

Raises an exception if the object is invalid.

class pydifact.segmentcollection.Message(reference_number: str, identifier: Tuple, *args, **kwargs)

A message (started by UNH segment, ended by UNT segment)

Optional features of UNH are not yet supported.

get_footer_segment() Segment

Return the footer segment or None if there is no footer.

get_header_segment() Segment

Return the header segment or None if there is no header.

validate()

Validates the message.

Raises EDISyntaxError in case of syntax errors in the segments.

property version: str

Gives version number and release number.

Returns:

message version, parsable by pkg_resources.parse_version()

class pydifact.segmentcollection.RawSegmentCollection(extra_header_elements: List[Union[str, List[str]]] = None, characters: Optional[Characters] = None)

A way to analyze an arbitrary bunch of EDIFACT segments.

Similar to the deprecated SegmentCollection, but lacking from_file() and UNA support.

If you are handling an Interchange or a Message, you may want to prefer those classes to RawSegmentCollection, as they offer more features and checks.

validate()

This is just a stub method, no validation done here.

class pydifact.segmentcollection.SegmentCollection(*args, **kwargs)

For backward compatibility. Drop it in v0.2

Will be replaced by Interchange or RawSegmentCollection depending on the need.

add_segment(segment: Segment) SegmentCollection

Append a segment to the collection. Passing a UNA segment means setting/overriding the control characters and setting the serializer to output the Service String Advice. If you wish to change the control characters from the default and not output the Service String Advice, change self.characters instead, without passing a UNA Segment.

Parameters:

segment – The segment to add

classmethod from_file(*args, **kwargs) SegmentCollection

Create an Interchange instance from a file.

Raises FileNotFoundError if the file is not found.

Parameters:
  • file – The full path to a file that contains an EDI message.

  • encoding – An optional string that specifies the encoding. Default is “iso8859-1”.

Return type:

FileSourcableMixin

class pydifact.segmentcollection.UNAHandlingMixin

For backward compatibility

For v0.2 drop this class and move add_segment() to Interchange class.

add_segment(segment: Segment) UNAHandlingMixin

Append a segment to the collection. Passing a UNA segment means setting/overriding the control characters and setting the serializer to output the Service String Advice. If you wish to change the control characters from the default and not output the Service String Advice, change self.characters instead, without passing a UNA Segment.

Parameters:

segment – The segment to add

Parser

class pydifact.parser.Parser(factory: Optional[SegmentFactory] = None, characters: Optional[Characters] = None)

Parse EDI messages into a list of segments.

convert_tokens_to_segments(tokens: list, characters: Characters, with_una: bool = False)

Convert the tokenized message into an array of segments.

Parameters:
  • tokens (list of Token) – The tokens that make up the message.

  • characters – The control characters to use.

  • with_una – Whether the UNA segment should be included.

Return type:

list of Segment

static get_control_characters(message: str, characters: Characters = None) Optional[Characters]

Read the UNA segment from the passed string and extract/store the control characters from it.

Parameters:
  • message – a valid EDI message string, or UNA segment string, to extract the control characters from.

  • characters – the control characters to use, if none found in the message. Default: “:+,? ‘”

Returns:

the control characters
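For orientation, a UNA segment consists of the tag UNA followed by exactly six service characters. The standalone sketch below reads them by position. It is not pydifact's parser: the function name `read_una` and the dictionary keys are hypothetical, and the function returns None instead of falling back to a default set.

```python
from typing import Dict, Optional


def read_una(message: str) -> Optional[Dict[str, str]]:
    """Return the six service characters of a leading UNA segment, or None."""
    if not message.startswith("UNA") or len(message) < 9:
        return None
    chars = message[3:9]  # the six characters directly after the tag
    return {
        "component_separator": chars[0],
        "data_separator": chars[1],
        "decimal_point": chars[2],
        "escape_character": chars[3],
        "reserved_character": chars[4],
        "segment_terminator": chars[5],
    }


defaults = read_una("UNA:+,? 'UNB+UNOC:1+sender+recipient'")
custom = read_una("UNA|^.! ~UNB^UNOC|1~")
absent = read_una("UNB+UNOC:1+sender+recipient'")
```

The first call yields the default set “:+,? ‘” quoted above, the second a custom set, and the third returns None because the message has no UNA segment.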

parse(message: str, characters: Characters = None) Generator[Segment, Any, None]

Parse the message into a list of segments.

Parameters:
  • characters – the control characters to use, if there is no UNA segment present

  • message – The EDI message

Return type:

Segments

class pydifact.segments.Segment(tag: str, *elements: Optional[Union[str, List[str]]])

Represents a low-level segment of an EDI interchange.

This class is used internally. Real-world implementations of specialized segments should subclass Segment and provide the tag and validate attributes.

validate() bool

Segment validation.

The Segment class is part of the lower level interfaces of pydifact. So it assumes that the given parameters are correct, there is no validation done here. However, in segments derived from this class, there should be validation.

Returns:

True if the given tag and elements are a valid EDIFACT segment, False if not.

class pydifact.segments.SegmentFactory

Factory for producing segments.

static create_segment(name: str, *elements: Union[str, List[str]], validate: bool = True) Segment

Create a new instance of the relevant class type.

Parameters:
  • name – The name of the segment

  • elements – The data elements for this segment

  • validate – If True, the created segment is validated before it is returned.

class pydifact.segments.SegmentProvider

This is a plugin mount point for Segment plugins which represent a certain EDIFACT Segment.

Classes implementing this PluginMount should provide the following attributes:

validate() bool

Validates the Segment.

Token

class pydifact.token.Token(token_type: Type, value: str)

Represents a block of characters in the message.

This could be content, a data separator (usually +), a component data separator (usually :), or a segment terminator (usually ‘).

class Type(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Serializer

class pydifact.serializer.Serializer(characters: Characters = None)

Serialize a bunch of segments into an EDI message string.

escape(string: Optional[str]) str

Escapes control characters.

Parameters:

string – The string to be escaped
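Escaping in EDIFACT means prefixing a control character with the release (escape) character. The standalone sketch below assumes the default characters and assumes that the component separator, data separator, escape character, and segment terminator are the ones being escaped; it is not a copy of pydifact's method and does not handle None.

```python
def escape(value: str, control_chars: str = ":+?'", escape_char: str = "?") -> str:
    """Prefix every control character in value with the escape character."""
    return "".join(
        escape_char + ch if ch in control_chars else ch for ch in value
    )


escaped = escape("What's 1+1?")
plain = escape("no control characters here")
```

In the first call the apostrophe, the plus sign, and the question mark each gain a leading `?`; the second string is returned unchanged.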

serialize(segments: List[Segment], with_una_header: bool = True, break_lines=False) str

Serialize all the passed segments.

Parameters:
  • segments – A list of segments to serialize

  • with_una_header – If True (the default), a UNA header is included. If the segments list contains a UNA header, that one is used; otherwise one is created from the default character set.

  • break_lines – if True, insert line break after each segment terminator.
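The effect of the two flags can be pictured with a reduced standalone sketch. It is not pydifact's Serializer: segments are hypothetical `(tag, elements)` tuples, the default control characters are hard-coded, escaping is left out, and a UNA segment inside the list is not treated specially.

```python
from typing import List, Tuple, Union

# Hypothetical stand-in for a Segment: (tag, elements); an element is a
# string or a list of component strings.
SketchSegment = Tuple[str, List[Union[str, List[str]]]]


def serialize(
    segments: List[SketchSegment],
    with_una_header: bool = True,
    break_lines: bool = False,
) -> str:
    """Join segments into one string using the default control characters."""
    terminator = "'" + ("\n" if break_lines else "")
    parts = []
    if with_una_header:
        parts.append("UNA:+,? " + terminator)
    for tag, elements in segments:
        fields = [tag]
        for element in elements:
            # A list is a composite element; its components are joined with ":".
            fields.append(":".join(element) if isinstance(element, list) else element)
        parts.append("+".join(fields) + terminator)
    return "".join(parts)


example = [("BGM", ["220", ["a", "b"]])]
compact = serialize(example)
without_una = serialize(example, with_una_header=False)
broken = serialize(example, break_lines=True)
```

With `break_lines=True`, this sketch places a newline after every segment terminator, including the one that closes the UNA header.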

Tokenizer

class pydifact.tokenizer.Tokenizer

Convert EDI messages into tokens for parsing.

end_of_message() bool

Check if we’ve reached the end of the message

extract_stored_chars() str

Return the previously stored characters and empty the store.

get_next_char() Optional[str]

Get the next character from the message.

get_next_token() Optional[Token]

Get the next token from the message.

get_tokens(message: str, characters: Characters = None) List[Token]

Convert the passed message into tokens.

Parameters:
  • message – The EDI message.

  • characters – The control characters to use for tokenizing. If omitted, a default set is used.

Returns:

A list of Token objects.

is_control_character() bool

Check if the current character is a control character.

read_next_char() None

Read the next character from the message.

If the character is an escape character, set the isEscaped flag to True, get the one after it and store that character in the internal storage.

store_current_char_and_read_next() None

Store the current character and read the next one from the message.
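The interplay of storing characters, handling the escape character, and emitting tokens can be sketched in one standalone function. It is a simplified illustration, not pydifact's Tokenizer: it assumes the default control characters, represents tokens as hypothetical `(type, value)` tuples, and the type names are made up for this example.

```python
from typing import List, Tuple

# Hypothetical token type names for the three separators.
SEPARATORS = {
    "+": "DATA_SEPARATOR",
    ":": "COMPONENT_SEPARATOR",
    "'": "TERMINATOR",
}


def tokenize(message: str) -> List[Tuple[str, str]]:
    """Split message into content and separator tokens, honouring '?' escapes."""
    tokens: List[Tuple[str, str]] = []
    stored: List[str] = []  # characters collected for the current content token
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == "?" and i + 1 < len(message):
            # Escape character: store the following character literally.
            stored.append(message[i + 1])
            i += 2
            continue
        if ch in SEPARATORS:
            if stored:
                tokens.append(("CONTENT", "".join(stored)))
                stored = []
            tokens.append((SEPARATORS[ch], ch))
        else:
            stored.append(ch)
        i += 1
    if stored:
        tokens.append(("CONTENT", "".join(stored)))
    return tokens


tokens = tokenize("FTX+AAI+text?+more:part'")
```

The escaped plus sign stays inside the content token `text+more` instead of being read as a data separator.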

Plugin API

Pydifact provides a framework in which some classes can be extended via plugins. These basically follow Marty Alchin’s Simple Plugin Framework.

The base meta class is a PluginMount:

class pydifact.api.PluginMount(name, bases, attrs)

Generic plugin mount point (= entry point) for pydifact plugins.

Note

Plugins that have an __omitted__ attribute are not added to the list!

SegmentProvider uses PluginMount and can thus be extended with plugins.
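The pattern behind such a mount point can be sketched with a small metaclass in the style of the Simple Plugin Framework. This is a standalone illustration, not pydifact's code: all class names are hypothetical, and checking `__omitted__` only in the class's own attributes is an assumption about how the omission rule might work.

```python
class SketchPluginMount(type):
    """Metaclass that records every subclass of a mount point."""

    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        if not hasattr(cls, "plugins"):
            # First class using the metaclass: this is the mount point itself.
            cls.plugins = []
        elif "__omitted__" not in attrs:
            # Any later subclass is a plugin, unless it opts out.
            cls.plugins.append(cls)


class SketchSegmentProvider(metaclass=SketchPluginMount):
    """Hypothetical mount point, standing in for SegmentProvider."""


class BGMSketch(SketchSegmentProvider):
    tag = "BGM"


class HiddenSketch(SketchSegmentProvider):
    __omitted__ = True
    tag = "XXX"


names = [plugin.__name__ for plugin in SketchSegmentProvider.plugins]
```

Defining a subclass is enough to register it; `HiddenSketch` stays out of the list because it declares `__omitted__`.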