Community models: Product description text shortener


This model is an automatic shortener for material descriptions and product titles. The use case is an ERP migration from AS400 to SAP S/4 HANA, which requires shorter material descriptions.

Problem statement:

  • In the current system AS400, the material descriptions are not limited in length. Therefore, a lot of information is stored in the description field.
  • In the new system SAP S/4 HANA, the Basic Text in the Material Description can only be 40 characters in length. This company works with scanners in logistics, which means that in practice the length is reduced even further to 38 characters.
  • In total there are about 30,000 materials in the database, so reducing them manually is not an option. The company also wanted a sustainable solution for all future material descriptions.
  • Since the production and logistics departments are used to working with descriptions, we needed to find a way to keep as much information as possible in the basic text to help them identify the material. Additional information can be stored in the Long Text field, but it is not displayed on the scanners and can only be accessed from a computer. In other words, the better the quality of the basic text, the more efficiently the departments can work.
  • The challenge was to come up with a consistent way of abbreviating all words, across the four languages used in the company – English, German, French and Dutch.

Solution:

  • In Python, BOLD.digital created a shortener API, with the following assumptions:
    • They used a rule-based approach.
    • They used a Python package (Pyphen) to divide all words into syllables.
    • Words shorter than 5 letters are not abbreviated. These words are often monosyllabic, and reducing their length makes them difficult to read. Examples are Round, Tube, Wired etc.
    • For German, the first syllable is kept intact; for every following syllable, only the first and the final letter are kept. This follows the logic of removing the vowels but keeping the most important consonants intact, which is based on a process called ‘disemvowelling’. In our tests, we found that keeping the first syllable intact increased readability.
    • BOLD adjusted the rules slightly per language. For example, in English many words in the material database ended with an ‘e’ (e.g. Spare, Frame), so for English the next-to-last letter of the final syllable is selected instead of the final letter.
    • Some of the words already had abbreviations that were used company wide. We added these to the glossary.
      • This means that the shortener checks the glossary first to see whether the word already has an abbreviation assigned to it. If not, it applies the rule. Have a look at the image for the workflow.
    • For BOLD’s client, they added regular expressions to reorder the product descriptions, improving readability. For example, all descriptions now start with the Product Group.
    • BOLD published this shortener as a Python package: ‘Abbreviator 0.0.4’.
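The glossary-then-rule workflow described above can be sketched roughly as follows. This is a minimal illustration, not the code of BOLD's Abbreviator package: the function names, the glossary entry and the handling of single-syllable words are assumptions, and syllables are passed in pre-split (in the real solution that split comes from Pyphen).

```python
from typing import Sequence

# Hypothetical glossary of company-wide abbreviations (illustrative entry only).
GLOSSARY = {"stainless": "SS"}

# Per the description: words shorter than 5 letters are not abbreviated.
MIN_LENGTH = 5


def shorten_syllable(syllable: str, use_next_to_last: bool = False) -> str:
    """Keep the first letter plus the final (or next-to-last) letter."""
    if len(syllable) <= 2:
        return syllable  # nothing to remove
    tail = syllable[-2] if use_next_to_last else syllable[-1]
    return syllable[0] + tail


def abbreviate_word(syllables: Sequence[str], lang: str = "en") -> str:
    """Abbreviate one word given as a list of syllables."""
    word = "".join(syllables)
    # Step 1: the glossary takes precedence over the rule.
    if word.lower() in GLOSSARY:
        return GLOSSARY[word.lower()]
    # Step 2: short words stay intact. Leaving single-syllable words
    # unchanged is an assumption of this sketch.
    if len(word) < MIN_LENGTH or len(syllables) < 2:
        return word
    # Step 3: keep the first syllable, shorten every following one.
    parts = [syllables[0]]
    last_index = len(syllables) - 1
    for index in range(1, len(syllables)):
        # English variant: next-to-last letter in the final syllable.
        english_final = lang == "en" and index == last_index
        parts.append(shorten_syllable(syllables[index], english_final))
    return "".join(parts)


print(abbreviate_word(["Ver", "bin", "dung"], "de"))  # Verbndg
print(abbreviate_word(["han", "dle"], "en"))          # handl
print(abbreviate_word(["Tube"], "en"))                # Tube
```

Passing syllables in explicitly keeps the sketch free of external dependencies; the actual rules per language may differ in detail from what is shown here.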

In the language field, enter either ‘en’, ‘de’, ‘fr’ or ‘nl’ (any unrecognized language input will default to English).

In the long field, enter any text or product description to be shortened.
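As a rough illustration of the two input fields, the sketch below builds an input record and applies the documented fallback to English. The field names `language` and `long` come from the description above; the helper names and the trimming and lowercasing of the language code are assumptions, not confirmed behaviour of the deployed model.

```python
SUPPORTED_LANGUAGES = {"en", "de", "fr", "nl"}


def normalize_language(value: str) -> str:
    """Return a supported language code, falling back to English."""
    # Trimming and lowercasing are assumptions of this sketch.
    code = (value or "").strip().lower()
    return code if code in SUPPORTED_LANGUAGES else "en"


def build_input(language: str, long: str) -> dict:
    """Assemble the two input fields described above."""
    return {"language": normalize_language(language), "long": long}


print(build_input("de", "Verbindung Rohr rund"))
print(build_input("es", "Tubo redondo"))  # unrecognized, so language becomes 'en'
```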

The model returns the original text and its shortened version.

This model is intended for demonstration and testing purposes only. UbiOps is not liable for any damages arising from the use or inability to use any of the models and applications listed on the UbiOps Community Model pages. Even though UbiOps and our partners carefully created and optimized these models, it is always advised to benchmark and check the respective functionality before applying it in any production setting.

Created: 20-11-2021

Last modified: 30-11-2021

Publisher: BOLD

This abbreviation model by BOLD makes a unique shortened version of a text or product name.

Are you interested in this specific model? You can reach out to BOLD or Stephanie Wagenaar (Lead Digital, BOLD) for more information.