Raster GIS data is sometimes provided in "raw" file formats where the importing application, in our case Manifold, must organize the data into usable form based on guidance provided in some accessory file. The data is said to be raw because the file itself provides no organizing structure, only a sequence of bytes that the importing application must interpret. If the accessory file or other accompanying documentation is lost, it can be very difficult to guess how the raw file should be interpreted.
Locate the documentation that accompanies the file. Read it.
Determine if the raw file contains binary data or text data.
Launch Scan Raw Binary File or Scan Raw Text File.
Choose the Data file to be scanned.
Enter information required by the dialog.
Press Scan. If the result is OK press Save to create an rwb file.
Choose File - Link and link to the rwb file.
To import instead of linking, Copy the resulting image and image table from the data source and Paste into the main part of the project.
Raw data files are usually one of two generic types: Raw binary files contain bytes that are to be interpreted as some numeric data type, such as integers or floating point types. Raw text files contain bytes that are to be interpreted as some form of text, where the text characters are then in turn interpreted as numbers or other information.
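The difference between the two types can be illustrated with a short Python sketch. It is not Manifold code. The sample values and the choice of 16-bit little-endian integers are assumptions made purely for illustration. The same four numbers are stored once as raw binary and once as raw text, then read back each way.

```python
import struct

# Four pixel values invented for this illustration.
values = [12, 300, -9999, 7]

# Raw binary: each value stored as a 2-byte signed little-endian integer.
binary_bytes = struct.pack("<4h", *values)

# Raw text: the same values written as characters separated by spaces.
text_bytes = " ".join(str(v) for v in values).encode("ascii")


def read_binary(data):
    """Interpret the bytes directly as 16-bit signed little-endian integers."""
    count = len(data) // 2
    return list(struct.unpack(f"<{count}h", data))


def read_text(data):
    """Decode the bytes to characters first, then parse the characters as numbers."""
    return [int(token) for token in data.decode("ascii").split()]


print(len(binary_bytes), read_binary(binary_bytes))
print(len(text_bytes), read_text(text_bytes))
```

Both readers recover the same numbers, but only if we already know which kind of file we have and, for the binary case, which numeric type was used.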
Manifold provides two tools, one for raw binary files and one for raw text files, which together provide generic import capability from a wide variety of different arrangements used in raw binary or raw text files:
Scan Raw Binary File - Scan a raw binary file using specified options and report the result. If the result seems OK, create a JSON configuration file that Manifold's raw binary (RWB) format dataport can use to link data into the project from the raw binary file.
Scan Raw Text File - Scan a raw text file using specified options and report the result. If the result seems OK, create a JSON configuration file that Manifold's raw text (RWT) format dataport can use to link data into the project from the raw text file.
Manifold Scan Raw tools are used in a two-step process:
Step 1: Scan and create a configuration file.
Step 2: Use the configuration file to link data into our project.
First, we scan the raw file to verify that our intended interpretation is correct. Second, we use a configuration file created by that scan to import the data. The configuration file is in human-readable JSON format.
Data within raw files can be arranged in various ways, which usually can be described by a relatively small set of options such as the data type, horizontal and vertical size of the grid of pixels and the number of data channels. We learn what options we must use for a particular file by reading documentation that accompanies the raw file. If that documentation provides clear information using sensible terminology, we usually can load a raw file on the first try.
Unfortunately, documentation describing the data within the raw file can use idiosyncratic and nonstandard terms to describe the organization of the raw file, or it might fail to provide important information such as the data type. In such cases we must try out various possibilities to see what works. Because raw files can be very large, trial and error that imports the entire file each time can be inconvenient. Manifold therefore provides Scan Raw tools, which can quickly scan a raw file using a given set of options to see if they work. If the result is obviously wrong, we can adjust the options and try again.
If the result is OK, we can command the tool to create a configuration file in rwb format which captures the options necessary to interpret data from a particular file. We then use that configuration file to link the data into Manifold. The two-step process lets us try out various settings before committing to what might be a lengthy import.
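When documentation omits the data type, one quick sanity check before scanning is to compare the file size with the size each candidate layout would produce. The Python sketch below is a hypothetical helper, not part of Manifold. The function names and the candidate lists of channel counts and value sizes are assumptions chosen for illustration, and it assumes no padding bytes.

```python
def expected_size(width, height, channels, bytes_per_value, skip_bytes=0):
    """File size in bytes implied by a layout (assumes no padding bytes)."""
    return skip_bytes + width * height * channels * bytes_per_value


def plausible_layouts(file_size, width, height, skip_bytes=0):
    """Return (channels, bytes_per_value) pairs whose implied size matches the file."""
    matches = []
    for channels in (1, 3, 4):
        for bytes_per_value in (1, 2, 4, 8):
            size = expected_size(width, height, channels, bytes_per_value, skip_bytes)
            if size == file_size:
                matches.append((channels, bytes_per_value))
    return matches


# A 960,000 byte file documented as 800 x 600 pixels with no header.
print(plausible_layouts(960_000, 800, 600))
```

A size match does not prove an interpretation is right, since several layouts can produce the same size, but a mismatch rules out an interpretation before any scan is attempted.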
Rasters are always rectangular arrangements of pixels, within which pixels are arranged in a sequence of rows, with all the rows having the same number of pixels. Rows might be referred to as lines in some metadata documents, with the number of rows being called the height. The number of pixels in a row might be called the length of each row, or the number of columns in that row, or the width of the raster. Data in a raw file is just one long sequence of bytes from beginning to end of the file.
If the raster image is 800 pixels wide by 600 pixels high, the data might be organized so the first 800 bytes are the data for the first row, the next 800 bytes are the data for the second row, and so on. Raw files do not normally contain information within the file on the width and height of the raster image. Instead, we must find that information from any accompanying documentation. If we do not know that the file contains data for an image that is 800 pixels wide by 600 pixels high with one byte per pixel, we will not be able to tell Manifold to use the first 800 bytes for the first row, the next 800 bytes for the next row and so on.
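The 800 by 600 example can be sketched in Python as follows. The byte sequence here is synthetic, standing in for the contents of a real file, and the helper names are hypothetical. Each row is filled with its own row number (modulo 256) so it is easy to see which row a given byte belongs to.

```python
WIDTH, HEIGHT = 800, 600

# Synthetic stand-in for a raw file: one byte per pixel, row r holds the value r % 256.
raw = bytes((row % 256) for row in range(HEIGHT) for _ in range(WIDTH))


def get_row(data, row, width=WIDTH):
    """Return the bytes of one row, assuming one byte per pixel and no header."""
    start = row * width
    return data[start:start + width]


def get_pixel(data, x, y, width=WIDTH):
    """Return the value of the pixel at column x, row y."""
    return data[y * width + x]


print(len(raw))                             # total bytes: width * height
print(set(get_row(raw, 1)))                 # every byte in row 1 has the same value
print(set(get_row(raw, 1, width=400)))      # wrong width: these bytes belong to row 0
```

With the wrong width, the slice that should be the second row actually falls inside the first row, which is why a scan with incorrect dimensions produces an obviously scrambled image.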
Data File |
Name of the raw binary file. Press the [...] browse button to navigate to the desired folder and to choose the desired file. |
Scan File |
Name automatically constructed by appending .rwb to the name of the raw binary file. Specify a different name if desired. |
Skip bytes |
The number of bytes in the beginning of the file to ignore. Use to skip over header and other non-data information sometimes found in raw files. |
Padding bytes |
The number of bytes to skip after each line. |
Null value |
The numeric value used to represent "no data" in the file for that pixel. For example, the number -9999 is often used to indicate no data for a pixel. |
Type |
Choose the data type represented by the binary data in the file. The accompanying format box allows choice of Intel (little-endian) or Motorola (big-endian) style encodings. |
Size |
The horizontal (East/West) and vertical (North/South) dimensions of the image in pixels. Specified as [width, height]. Some people prefer to think of this as [x, y] or as [ (number of columns), (number of rows) ], all of which are the same numbers. |
Channels |
The number of channels in the file. The accompanying box is enabled when the number of channels is greater than 1, and specifies the interleaving, that is, channel order, within the file as follows:
|
Scan |
Scan the data file using specified options. |
Save |
Enabled after a scan. Save the named rwb configuration file based on the specified options. |
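How the Skip bytes, Padding bytes, Null value, Type and Size options combine can be illustrated with a minimal Python reader for a single-channel file. This is a sketch under stated assumptions, not Manifold's implementation: the function name and parameters are hypothetical, the sample bytes are invented, and padding is assumed to follow every row including the last.

```python
import struct


def read_raw_binary(data, width, height, skip_bytes=0, padding_bytes=0,
                    fmt="<h", null_value=None):
    """Read a single-channel raster from bytes into a list of rows.

    fmt is a struct format for one value, e.g. "<h" for a little-endian
    16-bit signed integer. Values equal to null_value become None.
    """
    value_size = struct.calcsize(fmt)
    row_bytes = width * value_size
    row_stride = row_bytes + padding_bytes  # assumed: padding follows every row
    rows = []
    for y in range(height):
        start = skip_bytes + y * row_stride
        chunk = data[start:start + row_bytes]
        values = [v[0] for v in struct.iter_unpack(fmt, chunk)]
        rows.append([None if v == null_value else v for v in values])
    return rows


# Invented sample: 4-byte header, 3 x 2 pixels, 2 padding bytes after each row.
sample = (b"HDR0"
          + struct.pack("<3h", 1, 2, 3) + b"\x00\x00"
          + struct.pack("<3h", 4, -9999, 6) + b"\x00\x00")

grid = read_raw_binary(sample, width=3, height=2, skip_bytes=4,
                       padding_bytes=2, null_value=-9999)
print(grid)
```

Omitting either the skip or the padding value shifts every later read by a few bytes, which is the typical cause of a scan result that looks sheared or filled with implausible numbers.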
Data File |
Name of the raw text file. Press the [...] browse button to navigate to the desired folder and to choose the desired file. |
Scan File |
Name automatically constructed by appending .rwb to the name of the raw text file. Specify a different name if desired. |
Skip lines |
The number of lines in the beginning of the file to ignore. Use to skip over titles, comments, header and other non-data information sometimes found in raw files. |
Delimiter |
Enter characters (more than one is allowed) to be interpreted as delimiters. White space characters such as space, tab, and end-of-line, are always considered delimiters. |
Null value |
The text string used to represent "no data" in the file for that pixel. For example, the text -9999 is often used to indicate no data for a pixel. |
Type |
Choose the data type represented by numbers in the file. Text representations of numbers have no "Intel" or "Motorola" format, since they are read by Manifold as text. |
Size |
The horizontal (East/West) and vertical (North/South) dimensions of the image in pixels. Specified as [width, height]. Some people prefer to think of this as [x, y] or as [ (number of columns), (number of rows) ], all of which are the same numbers. |
Channels |
The number of channels in the file. The accompanying box is enabled when the number of channels is greater than 1, and specifies the interleaving, that is, channel order, within the file as follows:
|
Scan |
Scan the data file using specified options. |
Save |
Enabled after a scan. Save the named rwb configuration file based on the specified options. |
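The text options can be illustrated in the same way. The Python sketch below is a hypothetical reader, not Manifold's implementation: the function name, parameters and sample text are invented for illustration. It skips leading lines, treats white space plus any extra delimiter characters as separators, and compares each token with the null value as text before converting it to a number.

```python
import re


def read_raw_text(text, width, height, skip_lines=0, delimiters="", null_value=None):
    """Parse a single-channel raster from text into a list of rows of floats."""
    body = "\n".join(text.splitlines()[skip_lines:])
    # White space always separates values; extra delimiter characters are optional.
    pattern = "[" + re.escape(delimiters) + r"\s]+"
    tokens = [t for t in re.split(pattern, body) if t]
    if len(tokens) != width * height:
        raise ValueError(f"expected {width * height} values, found {len(tokens)}")
    # The null value is matched as a text string, before numeric conversion.
    values = [None if t == null_value else float(t) for t in tokens]
    return [values[y * width:(y + 1) * width] for y in range(height)]


# Invented sample: two header lines, then 3 x 2 comma-separated values.
sample_text = "ncols 3\nnrows 2\n1, 2, 3\n4, -9999, 6\n"

grid = read_raw_text(sample_text, width=3, height=2, skip_lines=2,
                     delimiters=",", null_value="-9999")
print(grid)
```

The count check plays the role of a scan: if the number of values found does not equal width times height, the options do not describe the file.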
For a step-by-step example, see the Example: Link NLCD using Scan Raw Binary File topic.
Encodings: The raw binary importer in Release 8 offered eight additional floating point encodings beyond Intel (called IEEE Intel) and Motorola (called IBM MVS). Those additional encodings served machines so old, such as Gould or Data General minicomputers, that sample data for them can no longer be found. Manifold therefore now offers a choice of either Intel or Motorola style floating point formats.
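The practical effect of choosing the wrong byte order can be seen in a short Python sketch using the standard struct module. The value 1.5 is arbitrary, and the sketch covers only 32-bit IEEE floating point values written in little-endian (Intel style) or big-endian (Motorola style) order.

```python
import struct

value = 1.5

little = struct.pack("<f", value)  # Intel style: least significant byte first
big = struct.pack(">f", value)     # Motorola style: most significant byte first

# The two encodings contain the same four bytes in opposite order.
print(little.hex(), big.hex())

# Reading little-endian bytes with the matching and the mismatched byte order.
correct = struct.unpack("<f", little)[0]
wrong = struct.unpack(">f", little)[0]
print(correct, wrong)
```

Reading with the mismatched order does not raise an error. It silently yields a different number, so a scan that shows wildly implausible values is a hint to try the other setting.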
File - Create - New Data Source
Assign Initial Coordinate System
Example: Link NLCD using Scan Raw Binary File - Use the Scan Raw Binary File tool to scan and to prepare a configuration file, which we use to link an NLCD raw binary file providing land cover data for Delaware as a raster image. We use a standard palette to color the land cover data and then we assign a projection to the newly imported image so it can be used as a correctly georegistered layer in maps.