Align to the production UN Vocab #536
Comments
Current draft version: https://service.unece.org/trade/uncefact/vocabulary/uncefact/ |
@VladimirAlexiev, @brownoxford, FYI ^ |
@nissimsan any updates? |
Yes, it's basically done. We're currently waiting for the This is what v1 will look like, though: dmvc7xzscpizo.cloudfront.net |
The much improved production vocabulary.uncefact.org is live now. We should switch our pointers from the draft URIs to this. |
We did this! |
@nissimsan - re "We did this!", I'm wondering what exactly we did. I've just been looking at https://vocabulary.uncefact.org/UnitMeasureCode

Hyperlinks such as https://vocabulary.uncefact.org/UnitMeasureCode#KGM go nowhere and provide no further details, and I don't see any conversion factors when I check the source code for the page. I also tried reloading the page after setting the HTTP header Accept: application/ld+json, but that just produced a 404 error page with this rather unfriendly message: 404 Not Found

Conversion factors are still present in the older JSON-LD file at https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld However that does not use the 2-3 character alphanumeric codes for its

In comparison, https://qudt.org/vocab/unit/KiloGM provides plenty of data about kilograms and a triple that links via qudt:uneceCommonCode to "KGM". It would be even better if each UN ECE Rec20 unit code had a corresponding URI such as https://vocabulary.uncefact.org/UnitMeasureCode/KGM that provided similar information about conversion factors, so that QUDT could link to such a Web URI within https://vocabulary.uncefact.org rather than a dumb string such as "KGM". |
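For anyone wanting to repeat the content-negotiation check described above, here is a minimal sketch in Python. It only builds the request and does not send it. The helper name `build_jsonld_request` is made up for illustration. The sketch also shows why a link ending in `#KGM` cannot return a separate response per unit: the fragment is handled by the client and never reaches the server.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_jsonld_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for JSON-LD.

    Hypothetical helper, not part of any UN/CEFACT tooling.
    """
    # The fragment (#KGM) is resolved client-side, so only the document
    # URI is used for the request.
    document_uri, _fragment = urldefrag(uri)
    return Request(document_uri, headers={"Accept": "application/ld+json"})


req = build_jsonld_request("https://vocabulary.uncefact.org/UnitMeasureCode#KGM")
print(req.full_url)              # the URI the server would actually see
print(req.get_header("Accept"))  # the media type being requested
```

Passing `req` to `urllib.request.urlopen` would perform the actual request; according to the comment above, the server answered 404 at the time of writing, which `urlopen` would surface as an `HTTPError`.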
Hi @mgh128, What we did was switch from the draft to the production UN/CEFACT term definitions (#726). So we now reference, for example, https://vocabulary.uncefact.org/consigneeParty. Good catch that the conversion factors are now missing from https://vocabulary.uncefact.org/UnitMeasureCode#KGM. Clearly that data has been available, so we must have dropped it along the way. Note that this is work done on the UN side, not on this repo. I will bring it up with the team - your clear requirements are a great help. I agree completely that QUDT should link with a real URI. Might that be something you can bring up there, changing KGM to https://vocabulary.uncefact.org/UnitMeasureCode#KGM? |
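As a rough illustration of what "switching from draft to production" amounts to for a JSON-LD context, the sketch below rewrites the draft base URI quoted earlier in this thread to the production base. It assumes local term names such as `consigneeParty` carry over unchanged, which is only confirmed here for that one term. The `consignee` key and the `to_production` helper are hypothetical and are not taken from this repo.

```python
DRAFT_BASE = "https://service.unece.org/trade/uncefact/vocabulary/uncefact/"
PRODUCTION_BASE = "https://vocabulary.uncefact.org/"


def to_production(value):
    """Recursively rewrite draft URIs to the production base.

    Assumes the local name after the base is identical in both vocabularies.
    """
    if isinstance(value, str):
        if value.startswith(DRAFT_BASE):
            return PRODUCTION_BASE + value[len(DRAFT_BASE):]
        return value
    if isinstance(value, dict):
        return {key: to_production(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_production(item) for item in value]
    return value  # numbers, booleans, None pass through untouched


# Hypothetical context fragment, for illustration only.
context = {"consignee": {"@id": DRAFT_BASE + "consigneeParty"}}
migrated = to_production(context)
print(migrated["consignee"]["@id"])
```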
Hi @nissimsan Many thanks in advance for alerting the UN CEFACT team about the missing conversion factors.

Unlike many other code lists, the code list(s) for unit of measure do require more than a code value and a description. So either there should be more 'columns' in the displayed table, or clicking on a link such as https://vocabulary.uncefact.org/UnitMeasureCode#KGM should result in a different page view with further details (including conversion factors), or perhaps expand an 'accordion' (e.g. using HTML

I also noticed that within the list of code lists at https://vocabulary.uncefact.org/code-lists there is not only the main unit of measure code list https://vocabulary.uncefact.org/UnitMeasureCode but also some additional code lists such as: unece:AirFlowUnitMeasureCode

Unfortunately, this means that a unit of measure such as KGM for kilogram now appears in more than one code list, e.g. unece:WeightUnitMeasureCode#KGM or unece:LinearUnitMeasureCode#MTR

Furthermore, the specialised code lists for unit of measure (e.g. https://vocabulary.uncefact.org/LinearUnitMeasureCode ) do not contain the complete set of units for that dimension or type of measurement. Similarly, https://vocabulary.uncefact.org/TemperatureUnitMeasureCode includes code values for degree Celsius (CEL) and degree Fahrenheit (FAH) but does not even include the SI base unit, kelvin (KEL), which only appears in the main code list as unece:UnitMeasureCode#KEL

I hope that you can also raise this issue with the UN CEFACT team.
Of course I'd be happy to discuss with QUDT folks and prepare a pull request when we've agreed what the QUDT property should be for pointing to corresponding Web URIs based on https://vocabulary.uncefact.org/UnitMeasureCode as a URI stem - but before I spend any time on that, I'd want to see https://vocabulary.uncefact.org/UnitMeasureCode updated to show the conversion factors that are already present in the older dataset at https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld If I could actually see the RDF dataset behind https://vocabulary.uncefact.org/UnitMeasureCode then I could (1) easily detect whether the conversion factors are missing from the dataset or just not shown in the user interface and (2) offer to add the potentially missing triples for conversion factors to the dataset (using a SPARQL query using that dataset and https://service.unece.org/trade/uncefact/vocabulary/rec20.jsonld as the data sources). We certainly appreciate the efforts of your team and the UN CEFACT team in making the code list for units of measure finally available as Linked Data rather than just an Excel spreadsheet and with the suggested improvements noted above, I think it will be a useful resource for everyone, including everyone in the GS1 community. |
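The comparison proposed above (find units that have a conversion factor in the older dataset but not in the newer one) can be sketched without a SPARQL endpoint. This is only an outline under several assumptions: both datasets have already been loaded and keyed by the same unit code, the property is called `conversionFactor`, and the sample values are placeholders. None of those details are confirmed by either published file.

```python
def missing_conversion_factors(old_units, new_units, prop="conversionFactor"):
    """Return {code: factor} for units whose factor exists only in old_units.

    `prop` is a hypothetical property name; adjust to the real datasets.
    """
    missing = {}
    for code, old_entry in old_units.items():
        factor = old_entry.get(prop)
        new_entry = new_units.get(code)
        # Only report units present in both datasets that lost the factor.
        if factor is not None and new_entry is not None and prop not in new_entry:
            missing[code] = factor
    return missing


# Placeholder sample data, not copied from either dataset.
old_units = {
    "KGM": {"conversionFactor": "kg"},
    "MTR": {"conversionFactor": "m"},
}
new_units = {
    "KGM": {"name": "kilogram"},
    "MTR": {"name": "metre", "conversionFactor": "m"},
}

patch = missing_conversion_factors(old_units, new_units)
print(patch)
```

The resulting dictionary is the set of candidate triples that could be offered back upstream. An empty result would suggest the factors are in the data and merely not shown in the user interface.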
Excellent @mgh128 - cheers! We can confirm the conversion factors went missing when we switched from Excel to a newer JSON Schema data source. The issue above is the first step, getting it included from upstream.

The term duplication you point out is the result of how the source data is modeled: using endless extensions rather than inheritance. This has been the main challenge of the project; there was no way around case-by-case decisions and rules.

Tagging @kshychko ref. conversation on slack yesterday. There are two things here: a) adding conversions, b) fixing duplicates. The former has a dependency and is IMO most critical, as we need those conversions no matter how and when we might change modeling in the future.

@mgh128, zooming out, I can't help pondering if the world actually needs two code lists. UN/CEFACT has traditionally liberally defined everything. In the modern world, this has led to significant term duplication, which is an anti-pattern (my opinion). Units seem like another case of this, and as much as I love and am proud of the QUDT-UN cross-linking, I feel like in an ideal world the UN would just adopt all of QUDT's terms where there is overlap. I'm curious if you see any arguments against this - is there a reason why the world needs both? And how should I be thinking about choosing a QUDT over a UN unit URI? |
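For item b), a quick way to size the duplication problem is to invert the code lists and report every code that appears in more than one. The sketch below uses a tiny sample built from the codes mentioned in this thread. Treating CEL and FAH as members of the main list is an assumption, and the real lists are far longer.

```python
from collections import defaultdict


def codes_in_multiple_lists(code_lists):
    """Map each code that occurs in more than one list to the sorted list names."""
    seen = defaultdict(set)
    for list_name, codes in code_lists.items():
        for code in codes:
            seen[code].add(list_name)
    return {code: sorted(names) for code, names in seen.items() if len(names) > 1}


# Illustrative sample only; membership is partly assumed (see note above).
code_lists = {
    "UnitMeasureCode": ["KGM", "MTR", "KEL", "CEL", "FAH"],
    "WeightUnitMeasureCode": ["KGM"],
    "LinearUnitMeasureCode": ["MTR"],
    "TemperatureUnitMeasureCode": ["CEL", "FAH"],
}

duplicates = codes_in_multiple_lists(code_lists)
for code, names in sorted(duplicates.items()):
    print(code, names)
```

In this sample KEL is not reported, because it appears only in the main list, which matches the gap in TemperatureUnitMeasureCode noted earlier in the thread.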
Hi @nissimsan Regarding two systems for units of measure, I'd note that the UN CEFACT Rec20 unit codes are widely referenced throughout GS1 standards, so as a result, they are widely used in EDI messages, traceability data and master data, at least in the fast moving consumer goods sector and other industry sectors that GS1 supports, including healthcare, apparel and technical industries. Having said that, in addition to UN/CEFACT Rec20 and QUDT, there is also UCUM - Unified Code for Units Of Measure ( https://ucum.org/ucum ). Unlike both QUDT and UN/CEFACT Rec20, it attempts to take a highly systematic approach to how its unit codes are created, rather than making choices that appear to be somewhat arbitrary. However, not all UCUM unit codes are URI-friendly, especially when using square brackets for units outside the SI system and forward-slash even in SI units such as m/s (metres per second), so that's a downside for a semantic ontology of units of measure and as far as I'm aware, UCUM is not yet published as a semantic ontology or Web vocabulary, whereas QUDT and UN/CEFACT Rec20 are. There is a corresponding dataset for UCUM - see https://ucum.nlm.nih.gov/ucum-lhc/ I'm fairly sure that QUDT provides links to UCUM unit codes but unfortunately only as string values because UCUM doesn't publish a Linked Data ontology as far as I know. I am aware that in some cases, UCUM code values have been used within GS1's GDSN data model to fill in gaps in the coverage offered by UN/CEFACT Rec20 unit codes. I'm not convinced that using two distinct UoM code lists to populate a single unitCode property is good practice but they didn't ask my advice before taking that decision! |
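The point about UCUM codes not being URI-friendly is easy to see by percent-encoding a few codes as a single URI path segment. This is a small sketch using Python's standard library. `uri_segment` is a name made up here, and `[in_i]` is used as an example of a bracketed UCUM code.

```python
from urllib.parse import quote


def uri_segment(code: str) -> str:
    """Percent-encode a unit code so it can be used as one URI path segment."""
    # safe="" forces "/" to be encoded too; otherwise "m/s" would read as two segments.
    return quote(code, safe="")


for code in ["KGM", "m/s", "[in_i]"]:
    print(code, "->", uri_segment(code))
```

A Rec20 code such as KGM passes through unchanged, while the slash and square brackets in the UCUM codes have to be escaped, which makes the resulting URIs harder to read and to type.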
Argh - QUDT!! 🤦‍♂️ ... The other four letter acronym with a Q in it! Updated. |
@nissimsan - yes, QUDT, not to be confused with the much older system SPQR which definitely didn't publish a Linked Data ontology ;-) |
The UN CEFACT LD vocab should be bumped to version 1.0, expected during fall of 2022.
The main updates required on our side include:
/#
(see 286 and 531). I advise that we don't start this work until the UN vocab is published.