Marina Bosi Interview

INSIGHTS: Interview with Dr. Marina Bosi

VP of Technology, Standards, and Strategies at Digital Theater
Systems, and president-elect of Audio Engineering Society

Interviewed by Mel Lambert in August 1998

In many ways, Marina Bosi defies simple definition. At heart an avowed academic, with a list of research credentials a mile long - as well as a confirmed supporter of the Audio Engineering Society - Dr. Bosi has an enviable capacity for turning scientific theory into technical reality. Currently VP of Technology, Standards, and Strategies at Digital Theater Systems, Marina is actively involved in developing business strategies and is playing a critical role in the selection of the technology to be incorporated in the Audio-DVD standard. She is also a member of the ANSI and ISO/MPEG committees setting up international standards for low bit-rate audio coding.
   Then president-elect of the Audio Engineering Society, Dr. Bosi also serves as a consulting professor at Stanford University's Center for Computer Research in Music and Acoustics (CCRMA), and was the editor of the new MPEG-2 Advanced Audio Coding standard (as well as authoring several publications on source coding for transmission and storage). It will come as no surprise that Dr. Bosi's current area of interest is low bit-rate coding with applications in music.
   Having graduated from the National Conservatory of Music in Florence, she received her doctorate in physics from the University of Florence, following her dissertation in Paris at IRCAM (Institut de Recherche et Coordination Acoustique/Musique). She has worked for Digidesign and for Dolby Laboratories' R&D and Business Development Department, where she developed and commercialized new low bit-rate audio coders. In addition, she has served the AES San Francisco Section as committee person, vice-chair and chair, and is also co-chair of the AES Conference Policy Committee.
   We caught up with this busy workaholic between business trips to Europe and Japan on behalf of DTS, as well as finalizing plans for the Fall 1998 AES Convention in San Francisco.

What first attracted you to the audio industry?
Marina Bosi: It's a long story. I have always had very strong musical interests. I thought at some point in my teens that, when I grew up, music was the only thing I would be doing in my career. However, I always had a strong interest in mathematics. To me, the best way to combine the two [themes] was to first get involved in the representation of music, which kind of led to an involvement with the representation of audio signals.

So your primary love was music, but with a parallel interest in mathematics?
Yes, that's what came to me naturally. I come from a family that was not at all musical. But when I was in my early teens I told my mother and father: "Look, I want to study music and you had better get me a good teacher." So they said: "Why not? She's going to probably let go of this in a week or so." But it's an affair that is still going on.

You were born and raised in Italy?
Yes, I was born near Milan, in Italy, and then raised in Florence - that's where I did all of my studies, with the exception of completing my thesis at IRCAM, in Paris. Obviously, I went there because of the name; to me, [IRCAM] seemed to be the best place where one puts together music and mathematics. At that time I was actually studying physics, so that's where I did my physics thesis. And as it happens, my advisor, Peppino Di Giugno, was a physicist [who was] somewhat involved in music. That's how it all came together and made it possible for me to complete a thesis in physics, but actually dealing with computer music.

Are your family involved in either music or mathematics?
Oh no, they're not. They're involved in encouraging me in whatever I felt was necessary to do, and to give me support.

So your primary interest was in music, which then moved through mathematics into audio technology. Do you still find time to play?
I play the flute with a Russian friend who's a wonderful pianist. I still play, because if I don't do it for a long time I miss it - it's almost like a physical need for me to get in touch with myself.

Yet it's acoustic music as opposed to electronic?
That's correct.

Do you think that a musical background is useful for your current focus?
It's hard to say. In a way, my career nowadays is geared more towards the technical aspects of sound, some management, and business relationship in the audio world. So, if you look at my day-to-day work, a [musical background] is perhaps not useful. But if you look at the inspiration, certainly, it's very much so. When I hear a piece of music and it moves me - and I'm aware that I was part of the group that enabled the technology for people to enjoy this type of experience - I'm very, very thrilled. That's what it's all about.

During listening tests I would think that you can bring a musical sensitivity to the critical process?
Very true. A musical background allows you to be very sensitive to differences. In audio recording you sometimes put yourself through a session of listening for very small differences. But having a musical background certainly helps you with that.

I guess you have a high appreciation for the purity of the flute signal, if nothing else.
Absolutely.

What was your dissertation topic at IRCAM?
I originally went there to work with Pierre Boulez on a piece for flute and interaction with a machine called 4X. We're talking about 1985/86. I went more and more into the computer type of design and, together with my advisor, we started thinking about the next generation. What my thesis ended up being was the design and simulation of a computer dedicated to music.

Did that become a commercial project?
When we started, DSP chips were not yet on the market, so we started designing everything [using] discrete multipliers and accumulators. By the time we were halfway through, the announcement came that a powerful DSP IC capable of combining all those functions on one single chip, the Motorola 56000, had become available. So, in a way, our project was somewhat obsolete. The project was completed, but I'm not sure if it was ever used. That goes into the category of "academic experience."
After I completed my thesis, I returned to Italy. I went on to work briefly for Luciano Berio, an Italian composer, who at the time was very interested in multichannel reproduction of sound. From that, I went to Stanford University's Center for Computer Research in Music and Acoustics, because it was the place where a lot of research was being done in psycho-acoustics for multichannel perception, emulation of moving sources, and so on, and I started working with John Chowning. From there, studying multichannel was very much related to music and computer music. I got help with it from companies in Silicon Valley, including Digidesign. That's where I started being interested in audio coding.

Because of the recording or transmission bandwidth required by 16-bit signals, audio coding is now the key to success of digital audio?
Yes. At the time, in 1988, very few people were dealing with 16-bit audio on computer platforms. Digidesign was one of the first; that's what attracted me to them. My first hard drive was 40 MB, and it was considered huge. Obviously, there was a need for data compression. What we started looking into was ADPCM (Adaptive Differential Pulse Code Modulation), which was originally developed by Sony/Philips for the CD-I system; that's what we implemented. I wanted to apply a more sophisticated concept, for lack of a better word, to the coding of music.

This was the move you made to Dolby Laboratories, where you were working on the AC-3 data-compression program?
Yes. Dolby served as a place to grow from being [focused], professionally, on an engineering point of view. I started working on perceptual coding, and it was very exciting. How we perceive sound in this context was something that was kind of a new concept; as an engineer you tend to put bits together without seeing that the ultimate stage in your work is actually the human ear. That was very much a concern then. I started learning more and more about how we perceive sound. From there I got involved in the latest development of high-quality audio coding. After Dolby I moved to DTS, where I remain focused on high-quality, multichannel audio.

Historically, the European standards committees - for a number of political reasons - have excluded Dolby and APT coding schemes from their considerations.
You're right. Unfortunately, as you mentioned, for political reasons that happens quite a lot. We had a history, between MPEG-1 and MPEG-2 and Layers 1, 2 and 3, of seeing a lot of political bullying. Things changed quite a bit in the development of MPEG-2 Advanced Audio Coding. We were able to harmonize the best efforts from many companies from around the world; as a result, probably the best perceptual audio coder was produced.

AC-3 seems to dominate the consumer and professional marketplace. MPEG-2 is optional on DVD, and is not being utilized for Digital TV and ATV transmission techniques within North America. Do you think this can be summarized as a "Betamax versus VHS" battle?
Correct. However, there are different flavors of MPEG-2. Most of what people refer to as MPEG-2 is actually MPEG-2, Layer 2; MPEG-2 Advanced Audio Coding is a different technology. MPEG-2's Layers 1, 2 and 3 based the design of the codecs under a very heavy restriction: a backwards-compatibility requirement. Any MPEG-1 decoder has to be able to decode any MPEG-2 bit stream. That, basically, completely limits the design of any multichannel coder, because you cannot base the design of a multichannel coder on the capability of a two-channel decoder. It implies that you need to use a much higher data rate in order to get the same quality. And that's why AC-3 was successful in outperforming MPEG-2 Layer 2. However, for AAC that restriction was lifted; [the result] was an exceptionally good coder.

Yet there are few consumer or professional applications of MPEG-2 in North America, either the Advanced Audio version or Layer 1, 2 and 3?
I'm not aware of any. Layer 3, for example, is very much used [for digital audio data reduction] on The Internet. MPEG-2 AAC is being considered for Japanese broadcasting applications.

Isn't Liquid Audio based on a hybrid AC-3 coder, and Progressive Network's Real Audio on MPEG?
I believe that Real Audio might still be based on AC-3, but I'm pretty sure that Liquid Audio is switching towards Advanced Audio Coding.

Why the move to DTS in June 1997?
I finished my project at Dolby with MPEG-2 Advanced Audio Coding - by the way, some of the Dolby technology was included in this standard - and was looking for the next challenge. DTS has a lot of potential in slightly different markets.

Your title is unusual: VP of Technology Standards and Strategy. What does that imply? Are you looking to increase people's awareness of DTS as a potential for music applications? I would think that Audio-DVD was a great application for DTS coding.
I totally agree with you; that's something we're looking into. DTS was and is extremely successful in cinema products, and now the recent expansion is towards consumer [markets]. The DTS approach was always one of increasing the bandwidth, and limiting data compression to a bare minimum, so that you can use the available medium and not kill the purity of music.
DTS has applications with CD - in the same storage area where you store two channels of audio, using DTS technology you're able to store 5.1 high-quality channels, a multichannel format that has been standardized by a number of organizations.

And still using the apt-X100 4:1 data-compression scheme used in DTS Cinema products?
No, actually it's a new, more powerful scheme developed in 1996. Instead of splitting the signal into four bands, we use 32 bands, with prediction applied to the data, yet with a very mild compression ratio. For example, to fit six channels into the space where you would fit two channels with the same resolution requires a 3:1 compression ratio. That's typically the compression ratio used by DTS for CD applications at 44.1 kHz sample rates. The audio sample resolution may vary, depending on the application.

There was talk of having a 96 kHz/24-bit Audio-DVD format. Obviously, now we need to get into at least twice the compression ratio.
Yes, and no. We're talking about much higher resolution, which is 96 kHz, 24-bit. If you compare that to 44.1 kHz/16-bit, it's a huge increase in data requirements. However, the capacity of a DVD-Audio disc is 4.7 GB - much, much higher than the storage we have available nowadays [on CD]. Depending on how much data you want to store - let's say, for the sake of simplicity, that you want 74 minutes, which is the same duration as a CD - then you need a compression ratio that is actually less than 2:1.
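
[As a rough check of the storage figures in this answer, here is a minimal sketch. It assumes six full-bandwidth channels of uncompressed 96 kHz/24-bit PCM, a 74-minute program, and a 4.7 GB disc taken as 4.7 x 10^9 bytes with no allowance for formatting overhead; the function names are illustrative only.]

```python
def pcm_bit_rate(channels, sample_rate_hz, bits_per_sample):
    """Uncompressed PCM bit rate in bits per second, summed over all channels."""
    return channels * sample_rate_hz * bits_per_sample


def required_compression_ratio(channels, sample_rate_hz, bits_per_sample,
                               minutes, capacity_bytes):
    """Ratio of uncompressed program size to disc capacity (values > 1 need compression)."""
    program_bits = pcm_bit_rate(channels, sample_rate_hz, bits_per_sample) * minutes * 60
    capacity_bits = capacity_bytes * 8
    return program_bits / capacity_bits


# Assumed scenario: 6 channels, 96 kHz, 24-bit, 74 minutes, 4.7e9-byte disc.
ratio = required_compression_ratio(6, 96_000, 24, 74, 4.7e9)
print(f"Required compression ratio: {ratio:.2f}:1")
```

[Under these assumptions the ratio comes out between 1.6:1 and 1.7:1, consistent with "less than 2:1."]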

DVD's offload or data-transfer speed is a further limitation.
That's correct. Right now, it's limited to 9.6 Mbit per second. If you have six channels at uncompressed 96/24, you end up with 13.8 Mbit per second in total. So yes, the throughput is very demanding.
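
[The throughput figures can be checked the same way. This is a sketch only, assuming six channels of uncompressed 96 kHz/24-bit PCM, 1 Mbit taken as 10^6 bits, and the 9.6 Mbit per second limit quoted above; the variable names are illustrative.]

```python
CHANNELS = 6
SAMPLE_RATE_HZ = 96_000
BITS_PER_SAMPLE = 24
TRANSFER_LIMIT_MBIT_S = 9.6  # limit quoted in the interview

# Uncompressed rate for one channel and for all six, in Mbit per second.
per_channel_mbit_s = SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1e6
total_mbit_s = per_channel_mbit_s * CHANNELS

# Minimum reduction needed just to fit under the transfer limit.
throughput_ratio = total_mbit_s / TRANSFER_LIMIT_MBIT_S

print(f"Per channel: {per_channel_mbit_s:.3f} Mbit/s")
print(f"Six channels: {total_mbit_s:.3f} Mbit/s")
print(f"Reduction needed for the transfer limit: {throughput_ratio:.2f}:1")
```

[On these assumptions each channel needs about 2.3 Mbit per second and the six together about 13.8 Mbit per second, so the transfer limit alone calls for a reduction of roughly 1.44:1.]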

Changing gears slightly, let's talk about your current role with the AES. How did you first become involved with the Society?
When I came to the United States, I first thought I would stay in academia for the rest of my life. Or work with composers that have a somewhat specialized audience. I came to Stanford, which happens to be in the core of Silicon Valley, and my views changed dramatically. When I started working at Digidesign, the idea that my ideas could be used by other people was completely thrilling. I started being involved more and more with the industry, including the AES. The AES is a catalyst; it puts together different people and allows them to meet and discuss new issues - that's certainly what it did for me.
I became a member as soon as I came here - that was about 10 years ago - working in the local committee. Then I became the chair of the San Francisco section, and was asked to chair the SF Convention in 1994. After that I was asked to be part of the board of governors, and to take over the vice presidency of the West Coast. One task led to the other until I was elected president of the society [for 1998/99, beginning in October]. I started from ground zero and built up. My position as vice president gave me a pretty good understanding of all the nuts and bolts involved with the AES, and I'm aware of other things very important to the society, such as standards.

Obviously standards committees are important to you. Do they help to provide a way of gathering together and learning about alternatives?
That's a good question. I don't know why, but I got involved in standards while working at Dolby. What attracted me to this type of work was the experience of comparing notes with other engineers around the world. Dealing with other people gave me a new perspective; having a chance to discuss this technical matter with top engineers around the world was very interesting to me. Also, if you have enough support, being able to participate and promote your technology in standards is really very powerful in terms of getting your thoughts and developments known in the rest of the world.

Certainly Dolby and DTS are very interested in standardization.
Right.

What are the challenges facing companies developing low-bit systems?
There are many. Historically, the development of low bit-rate coding belongs to the telecommunication industry. Audio quality that everyone can enjoy is really what it's all about. If you have pipelines that are too small to [offer] good-quality digital audio, then you're obliged to do something about it. But things are changing; we have more and more space available. Therefore, there is a need for us to understand the acoustic needs; what can we provide musicians, recording engineers and producers in order to make the sound experience more and more exciting? We're involved more and more with people that will develop new concepts, where these ideas were not possible five or even three years ago.

What, in a practical sense, are the effects of parallel and sequential coding?
Taking an encoded bit stream, decoding it and then re-encoding it? That's a very touchy subject. I was involved with the ITU-R [Standardization Committee looking at low bit-rate audio coding] when they were [evaluating] audio coding schemes for application in broadcasting. We were looking at a whole chain, from contribution to the different stages of delivery, which means the signal can be encoded and decoded a number of times. What happens is that you need to increase the data rate. If you are planning to encode and decode your signal a number of times, there is no way that you can get away with a low data rate, because the artifacts are going to be pretty nasty.

So it's better, if you can, to stay within the digital domain, and just decode the minimum number of times. Which is why standardization in a closed system, like a broadcast facility, might make more sense, but out in the open world, it's more difficult?
Right.

What is your primary focus as the new president-elect of the AES?
My experience is that the Society - at a regional and international level - is where people can convene and learn new things; they can exchange and generate new ideas. I would like to make this [process] easy and accessible for everyone. I'd like to be able to educate people on what the new technology is, and how we're going to be able to best use it; to focus attention on how important audio is in our everyday life. The AES is a leader in audio, and there are many challenges in order to keep this position.

So your main challenge is to offer enhanced education so that everyone's smarter and better informed.
Absolutely.

How do you think you might achieve this? Are you looking at an increased emphasis on workshops at conventions? Or maybe an interactive web site?
Yes, all of this. The media available today are very powerful. The Internet is one that comes to mind. Our conventions can play a big role in that [education process]. During the upcoming [1998 San Francisco] Convention we're going to have a workshop on The Internet 2, which is a slightly different approach than we're all used to, with a much higher bandwidth. We need to know what will be available at some point, and how to use it for conveying the best possible audio.
Conventions are certainly a place where people can gather information, and education certainly can happen during those times. And AES Conferences; we'd like to increase these to maybe one or two per year.