AES31 - Eliminating the File
Written by Mel Lambert for the October 2001 issue of "MIX."
Analog was easy. A reel of 2-inch 24-track tape could be played back in virtually any studio or post facility, anywhere in the world. However, these days the situation is far more confusing. Since manufacturers of disk-based recording systems developed their proprietary approaches to speed and flexibility, we are now faced with an array of data-storage formats and media choices. And, if our aim is to integrate digital recorders and workstations from a number of different vendors - and there are compelling reasons why we might want to do that - how can we eliminate this Tower of Babel?
The most likely solution is in the form of AES31. For several years, the Audio Engineering Society Standards Committee (AESSC) Working Group on Audio-File Transfer and Exchange, under the chairmanship of Mark Yonge, former SSL Digital Product Manager, and now AES Standards Manager, has been refining the various elements of what has emerged as a viable technique for transferring sound files and project data from one workstation or recorder to another. The Working Group's initial task was to facilitate audio interchange in production and post-production (with or without synchronized picture); distribution and archiving formats are being considered separately. AES31 provides a set of technical specifications that, when implemented in a workstation, allow disk drives, digital audio media and Edit Decision Lists to be transferred from one AES31-compliant workstation system to another.
"The AESSC Working Group was set up in reaction to the audio industry asking for simple project interchange," says Yonge. "We had seen what was happening with OMF, but thought that something simpler might also have a place in the data-exchange landscape. So we developed a four-tier approach to the problem." These four independent stages form a series of scalable modules with interchange options, to produce a multipart standard. Applications range from the simple interchange of a single sound file to complex projects involving fine editing of many source sounds. "The interchange method needs to be flexible enough to support all these needs at a level of complexity appropriate for the task," says Yonge.
AES31: THE FOUR INGREDIENTS
>> AES31-1 is concerned with Physical Data Transport, how files can be moved from one system to another - either via removable media or (later) a high-speed network. Basically, AES31-1 specifies a transport compatible with Microsoft's FAT32 structures, although, for copyright reasons, it doesn't actually name Microsoft or quote its proprietary specifications.
>> AES31-2 focuses on Audio File Format, how the data in BWF or Broadcast Wave chunks should be arranged on the removable media or packaged for network transfer.
>> AES31-3 describes a Simple Project Structure, using a sample-accurate Audio Decision List, or ADL.
>> The more complex AES31-4 Object-oriented Project Structure could use an extensible object model capable of describing a much wider range of parameters for applications where the costs of significant additional complexity can be justified. (More on this later.)
Stages 1 and 3 have been published as standards, with stage 2 anticipated shortly.
BWF is a standard developed in part by the European Broadcasting Union (EBU) based on conventional IBM/Microsoft RIFF/Wave audio files. An additional header chunk defines the format of the audio data, and includes a description of the sound sequence, the name of the originator, a reference of the originator, the origination date and time, plus a time reference. In essence, BWF time-stamps each audio file with its proper location in a project and adds useful identification information. The format-independent time reference is a 64-bit number representing the first sample in the file as a sample count since midnight, and can be used with any timecode or picture frame-rate and with any current or future sampling rate. For simple review, a BWF can be played on any system capable of playing a Wave file.
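To make the time reference concrete, here is a minimal Python sketch of how a receiving system might turn a sample count since midnight into a timecode address. The function name and the sample values are illustrative assumptions, not part of the BWF specification, and the sketch handles only whole-number, non-drop frame rates.

```python
def time_reference_to_timecode(sample_count: int, sample_rate: int, fps: int) -> str:
    """Convert a sample count since midnight into HH:MM:SS:FF.

    Assumes a whole-number, non-drop frame rate (e.g. 24, 25 or 30 fps).
    """
    total_seconds, leftover_samples = divmod(sample_count, sample_rate)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Frames elapsed within the current second, rounded down.
    frames = leftover_samples * fps // sample_rate
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


# Hypothetical file starting one hour and half a second after midnight at 48 kHz.
start = 48000 * 3600 + 24000
print(time_reference_to_timecode(start, 48000, 25))  # 01:00:00:12
print(time_reference_to_timecode(start, 48000, 30))  # 01:00:00:15
```

The same stored number yields a valid address at either frame rate, which is the point of keeping the reference in samples rather than frames.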
"AES31's ADL was modeled on conventional Edit Decision Lists," Yonge continues, "but with sample-accurate granularity or precision compared with that offered by PAL/NTSC video synchronization. And we wanted to include specific parameters for multiple audio channels, cross fades, level automation and other important values."
Brooks Harris, vice-chairman of AESSC SC-06-01, and president of Brooks Harris Film & Tape, New York, is a genius at sorting out EDLs and ADLs. "Aside from being 'human readable' and easily understood, AES31-3 is based on two important parameters: sample accuracy and file locators. We need to label in/out points of the component audio files in H:M:S:F and sample count."
AES31-3 uses a form of universal resource locator for accessing files on any platform or network. This includes the "file" (URL scheme designator), followed by the "host" name, the names of the local disk volume, directory, subdirectories, and then the file name with a .WAV extension; all branches in the sequence are separated by conventional forward-slashes used in familiar http:// and ftp:// addresses.
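As a rough illustration of that locator layout, the Python sketch below assembles a file URL from its parts. The helper name, the host, volume and folder names, and the use of percent-encoding for awkward characters are all assumptions made for the example; the exact syntax is defined by AES31-3 itself.

```python
from urllib.parse import quote


def make_file_locator(host, volume, directories, file_name):
    """Join host, volume, directories and file name with forward slashes."""
    parts = [volume, *directories, file_name]
    # Percent-encode each component (an assumption for this sketch).
    path = "/".join(quote(part, safe="") for part in parts)
    return f"file://{host}/{path}"


# Hypothetical names, for illustration only.
print(make_file_locator("localhost", "AUDIO_DISK", ["Project1", "Reel2"], "dialog_01.WAV"))
# file://localhost/AUDIO_DISK/Project1/Reel2/dialog_01.WAV
```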
In terms of timing accuracy, Harris spent a long while developing highly accurate algorithms that can mathematically accommodate any time base for the source and destination projects. "AES31 is the first set of analytic definitions that could be agreed upon by manufacturers to control data interchange," he says. "We have developed a technique for accurately defining the precise location of each sample of audio data, by combining conventional timecode data with a frame rate that lets the receiving DAW, for example, know precisely where the data is located and how to play out the elements in perfect synchronism." The AES31-3 ADL contains information about what files are to be played at which location in the timeline. It specifies the frame rate and time base; the film frame (A, B, C or D for 4-perf) and whether the time base is drop or non-drop (NTSC-only; PAL video runs at a fixed 25 fps rate). Also included in each BWF sample are the central sample rate - 32, 44.1, 48, 88.2 and 96 kHz - and one of five pull-up/down ratios, to provide multiple, unambiguous combinations that allow the project to be re-assembled in perfect sync.
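The arithmetic behind such a timeline position can be sketched in a few lines of Python. This is not Harris' algorithm or the ADL syntax, only an illustration under stated assumptions: a whole-number, non-drop frame rate, a hypothetical "extra samples" offset beyond the last whole frame, and the 1000/1001 factor taken as one example of a pull-down ratio.

```python
from fractions import Fraction


def timecode_to_samples(h, m, s, f, extra_samples, fps, sample_rate):
    """Return the absolute sample position for H:M:S:F plus a sample offset.

    Assumes non-drop timecode with a whole-number frame rate.
    """
    total_frames = ((h * 60 + m) * 60 + s) * fps + f
    samples_per_frame = Fraction(sample_rate, fps)  # exact, no rounding drift
    return int(total_frames * samples_per_frame) + extra_samples


def pulled_rate(sample_rate, ratio):
    """Apply a pull-up/down ratio to a nominal sample rate, exactly."""
    return sample_rate * ratio


# One hour, 12 frames and 5 samples into a 25 fps, 48 kHz project.
print(timecode_to_samples(1, 0, 0, 12, 5, 25, 48000))  # 172823045

# Example ratio only: a 0.1 percent pull-down of 48 kHz.
PULL_DOWN = Fraction(1000, 1001)
print(float(pulled_rate(48000, PULL_DOWN)))
```

Working in exact fractions rather than floating point keeps the position unambiguous when the sample rate does not divide evenly by the frame rate.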
AES31-4 is currently undefined except for its intent, but could be based, Harris says, upon current deliberations by the industry consortium that is promulgating Advanced Authoring Format, or AAF. Because AAF is intended to function as a multimedia file format that enables content creators to easily exchange digital media and metadata across platforms, and between systems and applications, it will include complex project structures that enable sample-accurate editing of multiple sources. The AAF Association's membership includes Avid, BBC, CNN, Discreet, Fox, Grass Valley Group, Liberty Livewire, Microsoft, Omneon, Panasonic, Philips, Pinnacle, Quantel, Sony and Warner Bros.
"Because AAF is being developed by a consortium and not a standards body," Harris explains, "it can have a wider scope. But the AES standards working group has an active liaison project with the AAF Association. AAF could be a 'super set' of AES31, with many common elements." Harris chairs this liaison group.
One of the main reasons DAW manufacturers cite for adopting an open standard - rather than attempting to emulate each other's formats - is that it only needs to be done once. An "anything-to-anything" solution can be extremely complex and costly to manage; AES31 compatibility is intended to reduce such effort.
As Andrew Brent, Fairlight USA's international technical director, explains: "We currently support, in Version 15.6 and beyond, BWF import/export, and from 16.5 on we will be supporting FAT32 disk-file format. Our remaining development is the interpretation and creation of AES31 ADL."
In terms of the future, Brent feels that AES31 should be kept simple. "There is a lot of discussion about adding object-oriented events to the ADL. But the current ADL is a text-based, readable format. Any code developed to support complex algorithms such as real-time EQ, with the corresponding DSP code, or dynamic level and cross fades, or the ability to create a 'Takes Layered' environment on one track, will create an ADL that will explode in size. The best use of AES31 is a simple interchange of raw audio between these systems."
Howard Schwartz, president of New York's Howard Schwartz Recording, feels that "AES31 has a long way to go. The installed base for OMF is so huge that the switch will be slow. Even if the format may be better, it isn't easier to totally switch away from OMF. We install whatever our clients request, after the second or third asking. OMF and Pro Tools were the first really significant technical requests coming from clients, not other facilities. I am sure Avid, as has Microsoft, will do whatever they can to preserve their dominance in the file wars. Does AES31 work side by side with OMF? Together? On top of; next to? A whole retraining must take place, and for what gain?"
But Jay Palmer from Universal Studios' post-production sound department - and a guiding light behind Hollywood Technical Audio Committee's efforts toward file-format standardization - feels differently. "AES31 compliance is very important. For TV editing and mixing of shows done in-house [at Universal], native Pro Tools is the usual choice. If the project comes from outside, there are a variety of ever-changing file formats. In these cases, all bets are off as the new file revisions of their platforms are not always supported by our equipment. A simple AES31 export would help solve this dilemma. The same applies to feature-film editing and mixing. There is a larger commitment to archiving the elements, predub mixes, stem-master mixes and print masters. Archiving in an industry standard would be much more beneficial in the long run.
"And for DVD mastering, when film AB reels are conformed for continuous DVD playout, prior to AC3 compression and streaming, the files are non-compressed. These become the new DVD non-compressed audio masters, and will be used as network TV needs them and for future formats. It would be good to have a standardized format for this new archival master. Archiving is all about maintenance of the content owners' intellectual property - important stuff. You do not want to archive to a proprietary, ever-changing file format; we need an industry-recognized, non-proprietary, AES-badged standard."
In terms of what might be missing from AES31, Palmer cites such issues as conversion to/from disparate disk file systems, including Mac HFS and HFS+, PC FAT16, FAT32, Linux and BeOS. "The AES31 format states ADL, BWF and FAT32. Period," he says. "Will folks have to drastically alter their workflow habits to support the standard to the letter of its spec? Will programmers be able to embed all of the disk utilities necessary to implement the standard?
"While AAF builds upon OMF as its container/EDL," Palmer continues, "it is controlled by the AAF Association, a trade association with dues-paying members. AAF is very all-encompassing; it has wide-ranging standards descriptions for audio, video and metadata containing a complete edit history. But a main point of contention for many manufacturers is that they do not want to support a standard that is wholly owned by another competitor. OMF, for example, is owned by Avid, and OpenTL by TimeLine Vista. AES31 is a 100 percent truly-open standard that is relatively easy to implement."
"Our industry is based on workflow of projects and media through the production chain," adds Ron Franklin, president of WaveFrame, and a member of AESSC SC-06-01. "Anything that facilitates workflow is beneficial to both our customers and our company. We believe AES31 will prove to be a very important standard. First, because it works. And second, because it is the only digital audio project-file interchange scheme officially ratified by a standards body that contributes to ISO standards."
WaveFrame systems can read and write to various disk formats, including FAT16, FAT32, NTFS, and Mac HFS and HFS+. "OpenTL is the file format used in the Tascam/TimeLine MX-2424 and MMR dubbers," says Franklin. "We have implemented support for this in WaveFrame/7 since OpenTL is currently the only format the MX-2424 supports; we wanted to provide our users with file and project compatibility with this machine."
Andy Morris is president of Buzzy's Music, an L.A.-based post and voice-over facility. "A user-defined, AES-developed, non-proprietary standard will go a long way to breaking down the industry's data-exchange log jam," he says. "Currently, we provide clients with BWF files from all our MFX3+ systems. Contrast that technique with using a tedious method of transferring eight tracks at a time to a DA-88, or two tracks at a time to DAT - with or without timecode - and later re-aligning these multiple passes in the receiving DAW. It's a real-time process: to transfer 10 spots to DAT, each 60 seconds long, would take over four hours for upload and download. Oh, and did I forget to mention that the client is not willing to pay for all that wasteful transfer time? AES31 can handle that in a matter of minutes; it's a no-brainer."
Jim DeFilippis is VP of television engineering for Fox Studios' New Technology Group, and heads up a task force for AAF. "We are working with the AES on defining the interoperability between AES31 and AAF. So far, we have determined that AES31, while suitable for many audio post applications, does not have the critical element that AAF seeks to provide: a complete history of the file essence, including all the post-production metadata.
"The goal of AAF is to allow post projects to flow from workstation to workstation and allow seamless transitions. AES31 stops short. Each time projects are transferred from workstation to workstation, the critical metadata has to be re-done or transferred via a different media (paper, floppy disk, tape label). This is somewhat inefficient and impedes the collaborative effort between departments."
SADiE's managing director, Joe Bull, also serves on AESSC SC-06-01, and says that "without AES31, the audio industry has to accept the 'one-size-fits-all' workstation philosophy. It's like a carpenter having only a hammer to fashion his creations; he needs a range of tools to achieve the job. Similarly, synching dailies, ADR, track laying, Foley, music editing, mixing, etc., are best handled using the most appropriate tools. AES31 implementation will allow the user to choose.
"With the first three parts of AES31 ratified," Bull says, "the only remaining part is mix automation; we always recognized that this was the trickiest part. So, rather than delay any form of interchange until every bell and whistle had been covered, the AES31 Working Group decided that a gradual approach would give the industry a working toolbox, with the basics initially and expandable in the future, and provide the best route forward."
SADiE is currently working to provide the first level of Part 4, level and mute automation. "We are prototyping some ideas, along with Brooks Harris and others, to see what can be sensibly achieved in a reasonable timescale," says Bull. "In the meantime, the industry has the basic toolbox that at least puts the right bit of audio in the right place in the EDL."
Bull considers that AAF, on paper at least, could provide everything that the industry needs. "However, until it's ratified, AAF provides absolutely nothing for anybody," he stresses. "There have been discussions that, to speed their deliberations, AAF may adopt parts of AES31. Even when they do finish and publish a complete working standard, there may still be problems. AAF is an object-oriented and thereby a complex, structured product handling everything from graphics, video, audio, etc. This may be fine for a video manufacturer with huge software resources, but could be beyond the scope of what audio-specific manufacturers can afford to develop."
"We're a very customer-driven company," notes Scott Dailey, Digidesign's VP of product marketing & business development. "Thus far, the vast majority of our customers have been asking us for other important things. Based on this feedback, we believe AES31 Level 4 is potentially very exciting and important to Digidesign and its customers, whereas Level 3 is probably not."
Dailey says that Digidesign and its parent, Avid Technology, have been delivering open, cross-application, cross-platform media and metadata interchange for several years, based on OMF. "OMF currently does everything AES31 Level 3 does, and more. It is mature, reliable technology, and thousands of post-production professionals around the world rely on it every day to get their work done. Further, OMF has been adopted by essentially every important company in post-production audio, as well as most - if not all - of the important film and video post companies. Given OMF's widespread adoption by end-users, and its broad adoption by manufacturers, AES31 Level 3 looks like a lot of work for a big step sideways."
It is important to note, Dailey says, that no company can reliably support a wide variety of interchange standards. "Our test grid is staggering already, and supporting a variety of legacy Pro Tools Session file formats, plus EDL, OMF, AAF, OpenTL, ADL, etc., is impractical. Digidesign must pick and choose the highest impact standards. Our highest priorities right now are OMF and AAF, because they encompass all the capabilities of the other standards. Not that AES31 isn't valuable; it's just a lower priority than other interchange formats and general Pro Tools features customers are clamoring for."
"FAT32 as the only supported file system is not a very good choice for manufacturers who build systems based on the Apple Macintosh," he says. "Our engineers have told me that FAT32 support is not physically impossible, but that it would require an amount of work that is greatly disproportional to the limited interest our customers have shown in AES31 Level 3 thus far. The other large unresolved issue is closure on AES31 Level 4."
"OMF and AAF offer several interesting possibilities for rich interchange that are not addressed by AES31 Level 3," Dailey adds, "including 'bread and butter' functions such as routing, volume and multichannel panning automation, and more advanced operations such as hardware and software plug-in processing parameters. Other compelling possibilities include tracking and management of historical metadata that would define the series of processes applied to a piece of audio or other compositional data, thus providing a roadmap from the original source material to the material in its current state. Users could rebuild a sound - or even an entire mix - from the original elements, perhaps omitting or changing certain processes that were applied somewhere along the way."
Workstations and Digital Recorders currently offering or planning to offer AES31 Compliance