Keenan Smith, a lead staff member on the SEEDEN 2 transition, told the Student User Group that updating vocabulary is currently a largely manual process, but that staff are "building tools that will speed up future updates." Staff expect to use scripting to approach near-real-time updates during a multi-year overlap of the two systems.
Keenan described common data problems found during alignment: typographical errors such as extra spaces, constituents with end-dated parts that required record changes, incompatible part combinations, and several incorrect CAS registry numbers. "We've found several of those, and that's gonna be even more important, come [SEEDEN 2] because the CAS registry number is a key field in its table in that database," he said.
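Two of the problems described above, stray whitespace and incorrect CAS registry numbers, lend themselves to automated checks. The following is a minimal Python sketch, not the staff's actual tooling: the function names `normalize_whitespace` and `is_valid_cas` are hypothetical, and the check relies only on the published CAS check-digit rule (the last digit equals the weighted sum of the preceding digits, counted from the right, modulo 10).

```python
import re

# CAS registry numbers have the form: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def normalize_whitespace(term: str) -> str:
    """Trim the ends and collapse internal runs of whitespace to one space."""
    return " ".join(term.split())


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    match = CAS_PATTERN.match(cas.strip())
    if not match:
        return False
    body = match.group(1) + match.group(2)  # every digit except the check digit
    check_digit = int(match.group(3))
    # Weight the rightmost body digit by 1, the next by 2, and so on.
    total = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return total % 10 == check_digit
```

A check like this catches mistyped digits and malformed strings, but it cannot tell whether a valid number is attached to the wrong constituent; that still requires comparison against an authoritative list.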
Staff said they are tracking names that exceed SEEDEN 2 character limits in a crosswalk table that maps each long name to its shortened form. They also said the first migration will exclude vocabulary that appears unused or has not been used in many years. Keenan and Tessa said the immediate goal is to finish cleaning chemistry- and project-related vocabularies and to reduce future errors by imposing the lower SEEDEN 2 field limits on new vocabulary submitted today.
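A crosswalk of this kind can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name `build_crosswalk`, the `max_len` parameter (the actual SEEDEN 2 limits were not stated), and the naive truncate-and-suffix rule, which stands in for whatever shortening convention staff actually use.

```python
def build_crosswalk(names, max_len):
    """Map each over-length name to a unique shortened name.

    Names that already fit within max_len are left out of the crosswalk.
    max_len is a hypothetical limit supplied by the caller.
    """
    crosswalk = {}
    # Short names already in use must not be reused for a truncated name.
    taken = {name for name in names if len(name) <= max_len}
    for name in names:
        if len(name) <= max_len or name in crosswalk:
            continue
        candidate = name[:max_len].rstrip()
        counter = 2
        # On a collision, reserve room for a numeric suffix such as "~2".
        while candidate in taken:
            suffix = f"~{counter}"
            candidate = name[: max_len - len(suffix)].rstrip() + suffix
            counter += 1
        taken.add(candidate)
        crosswalk[name] = candidate
    return crosswalk


# Illustrative input only; these are not actual vocabulary entries.
example = build_crosswalk(
    ["Total suspended solids, filtered",
     "Total suspended solids, unfiltered",
     "pH"],
    max_len=20,
)
```

Keeping the mapping in one table means the original long name stays recoverable after migration, and applying the same length check to new submissions keeps the table from growing.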
Why this matters: SEEDEN 2 will enforce key-field constraints (notably on CAS registry numbers) that the current system tolerates. Data submitters who rely on legacy lookup lists should expect some terms to be retired during migration and should verify CAS registry numbers and name formats in their requests.
Staff also apologized to regional data centers for increased coordination emails and said they are trying to send updates in bite-sized batches so as not to overload RDC staff.