[2023-10-30] Introducing Processing Logs: Enhancing Transparency in Semantic Element Transformations #37
Elijas announced in Announcements
Context
We're excited to introduce a new feature in our sec-parser project - the Processing Log for each Semantic Element. This feature is designed to enhance the transparency and traceability of our parsing process, making it easier for contributors and users to understand how each element is transformed.
In the realm of HTML parsing, especially within SEC EDGAR documents, a semantic element refers to a meaningful unit within the document that serves a specific purpose. As we parse these documents, we transform these semantic elements, and it's crucial to track these transformations for clarity and debugging purposes.
This is where the Processing Log comes into play. Each semantic element now has an associated processing_log that records various stages during its transformation. The log captures the origin of the transformation and a message detailing what occurred during that transformation.
Example
Here's an example. Notice that before returning the answer, the reasoning that led to a decision is logged (source):
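As a minimal sketch of this pattern (not the project's actual code), the classifier below appends an origin and a message to the element's log before it returns. The Element and TopLevelTitleClassifier classes, the "ITEM " prefix rule, and the log messages are all hypothetical stand-ins made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a semantic element (not sec-parser's class)."""

    text: str
    kind: str = "NotYetClassifiedElement"
    # Each entry is an (origin, message) pair, as described in the text above.
    processing_log: list[tuple[str, str]] = field(default_factory=list)


class TopLevelTitleClassifier:
    """Hypothetical classifier that logs its reasoning before returning."""

    def classify(self, element: Element) -> Element:
        origin = type(self).__name__
        if not element.text.upper().startswith("ITEM "):
            # Record why nothing changed, then return the element untouched.
            element.processing_log.append(
                (origin, "Skipped: text does not start with 'ITEM '.")
            )
            return element
        # Record the reason for the decision before applying it.
        element.processing_log.append(
            (origin, "Classified as TopLevelSectionTitle: text starts with 'ITEM '.")
        )
        element.kind = "TopLevelSectionTitle"
        return element


element = TopLevelTitleClassifier().classify(Element("ITEM 2. UNREGISTERED SALES"))
```

Logging on both the "skipped" and the "classified" paths means the log explains why an element was left alone as well as why it was changed.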
Usage
You can view the logs in various ways, including but not limited to:
- The element.processing_log variable for any SemanticElement
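A rough sketch of what reading such a log could look like, assuming each entry carries an origin and a message as described above. The LogEntry and Element classes, the format_log helper, and the message texts are hypothetical and do not reflect sec-parser's real API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LogEntry:
    """Hypothetical log entry: which step wrote it, and what happened."""

    origin: str
    message: str


@dataclass
class Element:
    """Hypothetical stand-in for a semantic element with a processing log."""

    text: str
    processing_log: list[LogEntry] = field(default_factory=list)


def format_log(element: Element) -> list[str]:
    """Render each log entry as a numbered, human-readable line."""
    return [
        f"{number}. [{entry.origin}] {entry.message}"
        for number, entry in enumerate(element.processing_log, start=1)
    ]


# Illustrative entries only; the message texts are made up.
element = Element(
    "ITEM 2. UNREGISTERED SALES",
    [
        LogEntry("TextClassifier", "Classified as TextElement."),
        LogEntry("HighlightedTextClassifier", "Classified as HighlightedTextElement."),
        LogEntry("TopLevelSectionTitleClassifier", "Classified as TopLevelSectionTitle."),
    ],
)

for line in format_log(element):
    print(line)
```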
Let's see the journey of how the element with the text "ITEM 2. UNREGISTERED SALES" was identified as TopLevelSectionTitle.
1. NotYetClassifiedElement. It's implied, so we don't see this initial state.
2. SemanticElement was classified as TextElement by TextClassifier.
3. Then classified as HighlightedTextElement by HighlightedTextClassifier. We can also see the contents of the HighlightedTextClassifier.text_style field.
4. Finally classified as TopLevelSectionTitle by TopLevelSectionTitleClassifier. We can also see the contents of the TopLevelSectionTitle.level and TopLevelSectionTitle.identifier fields.
Summary
This feature is part of our ongoing commitment to creating a robust, efficient, transparent open-source project. The Processing Log will significantly improve the maintainability of our code and make it easier for contributors to understand the transformation process.
As always, we welcome your feedback and suggestions to enhance the sec-parser project further. Thank you for your continued support and involvement in the sec-parser, sec-ai, and Alphanome.AI community. Stay tuned for more updates!