Dataset Translations
Translation files for datasets and related metadata in the SOS Data Catalog reside on the SOS machine under the /shared/sos/locale directory. There are three types of files used to translate datasets. Dataset name and description translations are defined in SOS playlists and in tab separated value (tsv) files, while major categories, subcategories, and keywords are defined in comma separated value (csv) files.
Each translation file follows the naming convention “xx_YY.zzz”, where “xx” is the [ISO 639 language code][iso-369], “YY” is the ISO 3166 country code, and “zzz” is the file format extension (sos, tsv, or csv). The locale information is extracted from the playlist filename, so it is critically important to use the correct language and country codes for the language being translated. All the translations for each language/country combination are defined together in the same sos, tsv, and csv files.
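As an illustration of the naming convention, the following Python sketch splits a translation filename into its language, country, and extension parts. The exact character rules (two lowercase letters, two uppercase letters) and the function name are assumptions made for this example, not something the SOS software is documented to enforce.

```python
import re

# Pattern for the "xx_YY.zzz" convention. The two-letter lowercase/uppercase
# rule is an assumption; "hds" is included because highlights files described
# later on this page use the same locale prefix.
LOCALE_FILE = re.compile(r"(?P<lang>[a-z]{2})_(?P<country>[A-Z]{2})\.(?P<ext>sos|tsv|csv|hds)")


def parse_locale_filename(name):
    """Return (language, country, extension), or None if the name does not match."""
    match = LOCALE_FILE.fullmatch(name)
    if match is None:
        return None
    return match.group("lang"), match.group("country"), match.group("ext")


print(parse_locale_filename("zh_TW.sos"))  # a well-formed name
print(parse_locale_filename("zh-tw.sos"))  # wrong separator and case, so None
```

A check like this can catch a misspelled locale before the file is copied into /shared/sos/locale.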
It is not required to supply translations for the entire SOS Data Catalog at once for a locale. Any dataset information without translations will remain in English and you can add more translations incrementally at any time.
English Language Overrides
The default locale is “en_US”, which is English in the United States. However, you are allowed to use en_US.sos, .tsv, and .csv files to make “translations.” While these aren’t technically translations, the definitions specified will override the default English information in the SOS Data Catalog, giving you the opportunity to customize the text there, if desired.
Dataset Translation Playlists
There are two options for translating the names and descriptions of datasets and their variations. The first of these is using standard SOS playlists (see the TSV Files section to use the other option). Either option works well, but playlists must be used if you have longer descriptions that require multiple paragraphs.
For translations, each dataset or variation is specified by three playlist properties: an include property with the dataset playlist path, a rename property with a translated name, and a description property with a translated description on one or more lines enclosed between {{ }} characters.
# ID 96: Nighttime Lights
include = /shared/sos/media/land/earth_night/nightlights/playlist.sos
rename = 夜間的地球
description = {{這個圖像由國家地球物理資料中心(NGDC)的防禦氣象衛星計畫DMSP)所紀錄下來的。國家地球物理資料中心的地球觀測團隊負責這些數據的研究與建檔,並將可利用的資料加以產品化。資料的蒐集則是利用每日繞行地球二次的極地軌道衛星,衛星有一個掃描線運轉系統,其可見光及近紅外光的(VNIR)感測器在夜間可做低度空間的監測,同時也可感測月光下的雲層、城鎮的燈光、工廠的區位,燃料氣的燃燒閃焰、火光、閃電和極光等。這些夜晚的光線資料都是防禦氣象衛星計畫在1994年10月到1995年3月之間蒐集的數據所建立起來的。
這張特別的圖像只顯示來自電力的光。海洋的部分以深藍色呈現,陸地則是以稍微淺一點的藍色來區分。所有的光都是明亮的白色。經濟繁榮或人口集中的區域通常光線較亮,大部分的海岸線附近也是高亮度地區,可見人們喜歡傍水而居。根據沿岸燈光可以勾勒出非洲尼羅河的輪廓。在美國,東半部地區人口密度比其他地區高。沿著燈光也可辨識重要高速公路之所在。
將全年的資料所合成的影像以及某一個夜晚的資料加以比較,就可以發現電力耗損的情形。全年所累積的光其影像是紅色的,特定一個夜晚的光則呈現綠色,該晚的溫度數據則以藍色呈現,因此雲層看起來是藍的。黃色代表一整年以及那個晚上都有亮光,綠色表示只亮那一晚;只有全年光影的則呈紅色。任何大範圍的紅色有可能就是電力耗損的區域。在卡崔納颶風侵襲過後的2005年8月30日,比較同一個地表不同時間拍攝的兩張影像可以看出災區的範圍;第一張是黑白的燈光影像,第二張有著色的照片則突顯一大片電力中斷的區域。 }}
# ID 44: Air Traffic
include = /shared/sos/media/atmosphere/air_traffic/playlist.sos
rename = 空中交通
description = {{每一天,在美國的天空都有87,000多的航班。其中1/3是航空公司,如西南航空。平均每天,空中交通管制員需要處理28,537的商業航班(其中包括全球跟地區航空公司),27,178通用航班(例如:私人飛機),24548空中租用航班(飛機租用),5,260班次的軍事飛行航班和2,148班次的遞送航班(聯邦快遞, UPS等)。在任何一個時段都有大約5,000航班在美國天空。一年裡,平均有6,400萬次的起飛和著陸。}}
An example of part of a zh_TW.sos playlist used to translate several datasets into Traditional Chinese (Taiwan).
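To make the structure of a translation playlist concrete, here is a minimal Python sketch that reads include, rename, and description properties into a list of dictionaries. It is not the parser SOS itself uses. It assumes that every entry starts with an include line, that lines beginning with # are comments, and that multi-line values are wrapped in {{ }} as in the example; the function name is hypothetical.

```python
def parse_translation_playlist(text):
    """Parse include/rename/description properties into a list of dicts."""
    entries = []
    current = None
    pending_key = None   # key of a {{ ... }} value that is still open
    pending_lines = []

    for line in text.splitlines():
        if pending_key is not None:
            # Inside a multi-line value: keep reading until the closing }}.
            if "}}" in line:
                pending_lines.append(line.split("}}", 1)[0])
                current[pending_key] = "\n".join(pending_lines).strip()
                pending_key = None
            else:
                pending_lines.append(line)
            continue

        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue  # blank line, comment, or something this sketch ignores

        key, value = (part.strip() for part in stripped.split("=", 1))
        if key == "include":
            # Assumption: an include line opens a new dataset entry.
            current = {"include": value}
            entries.append(current)
        elif current is None:
            continue  # property before any include line
        elif value.startswith("{{"):
            body = value[2:]
            if "}}" in body:
                current[key] = body.split("}}", 1)[0].strip()
            else:
                pending_key, pending_lines = key, [body]
        else:
            current[key] = value
    return entries


sample = """# ID 44: Air Traffic
include = /shared/sos/media/atmosphere/air_traffic/playlist.sos
rename = 空中交通
description = {{第一段
第二段}}
"""
for entry in parse_translation_playlist(sample):
    print(entry["include"], entry["rename"])
```

Listing the parsed entries is a quick way to spot a description whose closing }} is missing, because the following entries will not appear in the output.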
Translating Dataset Playlists for a New Language
- Generate playlist files from the SOS dataset catalog using translations2db --generate_playlist (see the translations2db Command Line Utility section for details)
- Copy /shared/sos/locale/generated/en_US.sos to /shared/sos/locale/xx_YY.sos, following the xx_YY.sos locale naming convention for the language and country for which you want to create a translation
- Replace the English values (to the right of the equals sign) for the rename and description keywords with translated values in a Linux text editor, such as vi or gedit. Be sure the description text is enclosed between {{ and }} characters
- Reload your dataset translations into the SOS Data Catalog using either the SOS Stream GUI or translations2db --load_playlists (see the translations2db Command Line Utility section for details)
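The copy step in the list above can be scripted. The Python sketch below covers only that step and assumes translations2db --generate_playlist has already been run, so that generated/en_US.sos exists. The function name and the refusal to overwrite an existing translation are choices made for this example, not documented SOS behavior.

```python
import shutil
from pathlib import Path


def start_playlist_translation(locale, locale_dir="/shared/sos/locale"):
    """Copy the generated English playlist to xx_YY.sos for a new locale.

    Raises FileExistsError rather than overwriting an existing translation.
    """
    source = Path(locale_dir) / "generated" / "en_US.sos"
    target = Path(locale_dir) / f"{locale}.sos"
    if target.exists():
        raise FileExistsError(f"{target} already exists; edit it instead")
    shutil.copyfile(source, target)
    return target
```

For example, start_playlist_translation("zh_TW") would create /shared/sos/locale/zh_TW.sos, ready for editing and a later reload with translations2db --load_playlists.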
Editing Dataset Playlist Translations for an Existing Language
Editing a translation (or English override text) is done by simply modifying the values to the right of the equals sign of either rename or description keywords for any datasets in a translation playlist with your favorite text editor. Be sure the description text is enclosed between {{ and }} characters. Should you wish to remove a particular translation entirely, just delete the lines containing include, rename, and description for any datasets you no longer want.
TSV Files
The second option for translating the names and descriptions of datasets and their variations is with tab separated value (tsv) files. Either option works well, but tsv files have the advantage of being easily loaded into a standard spreadsheet for convenient editing. For translations, each dataset or variation is specified by four columns: A is the dataset ID number in the Data Catalog, B is the playlist file path, C is the dataset name, and D is the dataset description.
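For readers who prefer to inspect a tsv file outside a spreadsheet, the four-column layout can be read with Python's standard csv module. This is a sketch only: it assumes the file has no header row and no quoted fields, and the dictionary keys are names chosen for this example.

```python
import csv
import io


def read_dataset_tsv(stream):
    """Yield one dict per row of a four-column dataset translation tsv."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed rows in this sketch
        dataset_id, playlist, name, description = row
        yield {
            "id": dataset_id,
            "playlist": playlist,
            "name": name,
            "description": description,
        }


sample = "44\t/shared/sos/media/atmosphere/air_traffic/playlist.sos\t空中交通\t描述\n"
for row in read_dataset_tsv(io.StringIO(sample)):
    print(row["id"], row["name"])
```

When reading a real file, open it with encoding="utf-8" and newline="" so that non-ASCII translations and line endings are handled consistently.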
One highly efficient way to use this option is to upload the dataset tsv file you want to translate into a Google Sheet using Google Drive and then use the Google Translate formula to automatically translate all the text.

An example of part of a zh_TW.tsv file imported to a Google Sheet, with translations of dataset names from column B to Traditional Chinese in column C using the GOOGLETRANSLATE function.
To complete the translation for this example, copy column C over column B (using Paste special > Paste values only), delete column C, then repeat these same steps for the dataset descriptions. Once the translations are all pasted as values, they may be edited by hand to fix errors and improve the quality of the translations.

The dataset TSV after translation is completed.
The final steps are to download the sheet as a tsv file and copy it to /shared/sos/locale/zh_TW.tsv (replacing the original file there).
Reload your dataset translations into the SOS Data Catalog using translations2db --load_dataset_tsv (see the translations2db Command Line Utility section for more details).
Highlights (Spotlight) Translations
Highlight (spotlight) datasets were added in SOS 5.3 and are a small set of SOS datasets periodically selected by the NOAA SOS Team in Boulder. They provide an easy way to gradually explore the SOS Data Catalog over time and enable quick access to datasets that highlight timely events. Translations for these datasets are specified by a combination of normal dataset translations made in .sos or .tsv files and additional definitions in highlights dataset (.hds) files, which have a similar structure to a playlist. Note that highlights datasets are also associated with Highlights Categories used to classify a particular kind of highlights dataset. These are translated separately in the same csv files used for other dataset categories and keywords (see the following section).
For translations, each highlights dataset or variation is specified by several
properties: an include property with the dataset playlist path, a
TemporaryDatasetName property with a translated name (only used for temporary
datasets), and a WhyDescription property with a translated description of why
the dataset is being highlighted on one or more lines enclosed between {{ }}
characters. Note that the dataset name is not translated in the
hds file unless it is temporary (i.e., not in the SOS data catalog).
Regular dataset name translations are already made using .sos or
.tsv files (see the previous section).
# ID temporary
include = /shared/sos/media/temporary/text_pips_example/playlist.sos
TemporaryDatasetName = 我的臨時數據集
WhyDescription = {{<p>此數據集強調了2017年4月SOS版本5.2中發布的稱為Text PIPs的SOS新功能。</p>
<p>文本PIP允許在球體上靈活顯示文本。 功能包括顏色選擇,字體樣式,背景顏色,不透明度,以及支持多種語言,如中文和西班牙文。</p>
<p>使用SOS Visual Playlist Editor快速為您的數據集創建文本!</p>}}
# ID 3 (variation)
include = /shared/sos/media/atmosphere/2005_hurricane/grayir/playlist_audio_rita.sos
WhyDescription = {{<p>測試數據集變體。</p>}}
# ID 56
include = /shared/sos/rt/noaa/sat/enhanced/playlist/playlist.sos
WhyDescription = {{<p>氣象學家使用紅外衛星圖像來確定雲層的位置,更重要的是雲層是如何移動的。</p> <p> <b>以粗體測試文本。</b> </p>}}
# ID 82
include = /shared/sos/media/land/blue_marble/blue_marble/playlist.sos
WhyDescription = {{<p>測試其可見性設置為0的亮點數據集。</p>}}
An example of part of a zh_TW.hds file used to translate several highlights datasets into Traditional Chinese (Taiwan).
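Judging from the example, WhyDescription values contain HTML fragments such as &lt;p&gt; and &lt;b&gt;. The Python sketch below is one way to check a fragment for unbalanced tags before loading it; it is an illustration built on the standard html.parser module, not a tool shipped with SOS, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TagBalanceChecker(HTMLParser):
    """Track open tags and record mismatched closing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check_markup(fragment):
    """Return a list of problems; an empty list means the tags are balanced."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    # Anything left on the stack was opened but never closed. Note that void
    # tags written without a slash (for example <br>) are reported here too.
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.stack]


print(check_markup("<p>測試數據集變體。</p>"))
print(check_markup("文本</p>"))
```

An empty list means every opening tag in the fragment has a matching closing tag in the right order.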
Translating Highlights (Spotlight) Datasets for a New Language
- Generate hds files from the SOS dataset catalog using translations2db --generate_highlights (see the translations2db Command Line Utility section for details)
- Copy /shared/sos/locale/generated/en_US.hds to /shared/sos/locale/xx_YY.hds, following the xx_YY.hds locale naming convention for the language and country for which you want to create a translation
- Replace the English values (to the right of the equals sign) for the TemporaryDatasetName and WhyDescription keywords with translated values in a Linux text editor, such as vi or gedit. Be sure the WhyDescription text is enclosed between {{ and }} characters
- Reload your dataset translations into the SOS Data Catalog using either the SOS Stream GUI or translations2db --load_highlights (see the translations2db Command Line Utility section for details)
Editing Highlights (Spotlight) Dataset Translations for an Existing Language
Editing a translation (or English override text) is done by simply modifying the values to the right of the equals sign of either TemporaryDatasetName or WhyDescription keywords for any datasets in an hds file with your favorite text editor. Be sure the WhyDescription text is enclosed between {{ and }} characters. Should you wish to remove a particular translation entirely, just delete the lines containing include, TemporaryDatasetName, and WhyDescription for any highlights datasets you no longer want.
Dataset Category and Keyword Translations
The SOS Data Catalog uses categories and keywords to organize and provide searching capabilities for the hundreds of datasets it holds. Each dataset has at least one “Major Category” and “Subcategory” to classify it and usually has one or more “Keywords” pertaining to its content. Each highlights (spotlight) dataset also has a “Highlights Category”. These metadata entities are localized using comma separated value (csv) files. Csv is a common import/export format for spreadsheets, such as Excel or Google Sheets. The csv files follow the naming convention xx_YY.csv, where “xx” is the [ISO 639][iso-369] language code and “YY” is the ISO 3166 country code. The default locale is en_US, which is English in the United States. However, if an en_US.csv file is present, it will not be imported into the SOS Data Catalog since American English values are already defined by default.
Each row in the csv file includes the type of metadata, the English text, and the translated text.
MajorCategory,Water,Agua
MajorCategory,Land,Tierra
MajorCategory,Space,Espacio
MajorCategory,Extras,Extras
MajorCategory,People,Gente
MajorCategory,Snow and Ice,Nieve y Hielo
MajorCategory,Air,Aire
MajorCategory,Site-Custom,Sitio-Aduana
SubCategory,Temperature,Temperatura
SubCategory,Real-time Weather Models,En tiempo real Tiempo Modelos
SubCategory,Health,Salud
Keyword,Phases,Fases
Keyword,Density,Densidad
Keyword,Sulfate,Sulfato
HighlightsCategory,Current Event,Evento actual
HighlightsCategory,Dataset of the Month,Conjunto de datos del mes
HighlightsCategory,Latest Dataset,Último conjunto de datos
An example showing portions of a Google Translated es_MX.csv file for Mexican Spanish.
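The three-column rows can also be read programmatically. The Python sketch below builds a lookup from (metadata type, English text) to the translated text using the standard csv module. It assumes there is no header row, and the function name is chosen for this example.

```python
import csv
import io


def load_category_translations(stream):
    """Map (metadata type, English text) to the translated text."""
    table = {}
    for row in csv.reader(stream):
        if len(row) != 3:
            continue  # skip blank or malformed rows in this sketch
        kind, english, translated = row
        table[(kind, english)] = translated
    return table


sample = "MajorCategory,Water,Agua\nSubCategory,Health,Salud\nKeyword,Density,Densidad\n"
translations = load_category_translations(io.StringIO(sample))
print(translations[("MajorCategory", "Water")])
```

Keying on both the type and the English text keeps a term that appears as, say, both a Subcategory and a Keyword from overwriting itself.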
Translating Categories and Keywords for a New Language
- Create an en_US.csv file (see the translations2db Command Line Utility section for details)
- Rename it to the correct locale name following the xx_YY.csv naming convention
- Load it into a spreadsheet program (or a text editor if preferred)
- Replace the last column of English text with translated values. Do not modify the text in the first two columns
- Export back to csv (if using a spreadsheet). Be sure the xx_YY.csv file is placed in the /shared/sos/locale/ directory
- Load the xx_YY.csv file into the SOS Data Catalog using either the SOS Stream GUI or translations2db command line utility
Editing Category and Keyword Translations for an Existing Language
- Load an existing xx_YY.csv file into a spreadsheet program (or a text editor if preferred)
- Update the last column of translated text with your edits. Do not modify the text in the first two columns
- Export back to csv (if using a spreadsheet). Be sure the xx_YY.csv file is placed in the /shared/sos/locale/ directory
- Reload the xx_YY.csv file into the SOS Data Catalog using either the SOS Stream GUI or translations2db command line utility
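Because the first two columns must stay unchanged, it can help to compare an edited file against the generated English one before reloading. The Python sketch below reports (type, English text) pairs that appear in the translated file but not in the English file. It assumes both files share the three-column layout shown earlier; the function name is hypothetical.

```python
import csv
import io


def unknown_keys(english_stream, translated_stream):
    """Return (type, English text) pairs found only in the translated file.

    A non-empty result suggests the first or second column was edited.
    """
    def keys(stream):
        return {tuple(row[:2]) for row in csv.reader(stream) if len(row) >= 2}

    return sorted(keys(translated_stream) - keys(english_stream))


english = io.StringIO("MajorCategory,Water,Water\nMajorCategory,Land,Land\n")
edited = io.StringIO("MajorCategory,Water,Agua\nMajorCategory,Lands,Tierra\n")
print(unknown_keys(english, edited))
```

In this usage example the second row is reported because "Land" was accidentally changed to "Lands" in the English column.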