Sounds

Sounds Contents

Sounds
DefineSound
Sound Styles
Sound Envelope
StartSound
ADPCM Compression
ADPCM Sound Data
ADPCM Packets
ADPCM Code Data
MP3 Compression
MP3 Sound Data
MP3 Frame
Streaming Sound
SoundStreamHead
SoundStreamHead2
SoundStreamBlock
SDK Examples

 

Sounds

 

The SWF file format defines a small and efficient sound model. SWF supports sample rates of 5.5, 11, 22 and 44 kHz in both stereo and mono.  It is assumed that the playback platform can support rate conversion and multi-channel mixing of these sounds.  The number of simultaneous channels supported depends on the CPU of specific platforms, but is typically three to eight channels.

 

There are two types of sounds in SWF:

 

1.     Event Sounds.

2.     Streaming Sounds.

 

Event sounds are played in response to some event such as a mouse-click, or when the player reaches a certain frame.  Event sounds must be defined (downloaded) before they are used.  They can be reused and mixed at runtime.  Event sounds may also have a sound ‘style’ that modifies how the basic sound is played.  Sound styles include looping, fades and level control.  For more sophisticated control, there is a sound ‘envelope’, a simple transform that can generate various types of fades and level control, such as fade-in, fade-out and cross-fade.

 

Streaming sounds are downloaded and played in tight synchronization with the timeline. In this mode, sound packets are stored with each frame. Ordinarily, if a CPU cannot render the frames of a movie as quickly as the movie's frame rate specifies, the player slows down the animation rate rather than skipping frames. When streaming sound is present, however, the player skips frames in order to maintain tight synchronization with the sound track.

 

Note: A timeline can only have a single streaming sound playing at a time, but each movie clip, or sprite, can have its own streaming sound.

 

There are three structures involved in playing an event sound:

 

1.     The DefineSound tag defines the audio samples that make up an “event” sound.  It also includes the sampling rate, sample size, and a stereo/mono flag.

2.     The SOUNDINFO record defines the ‘styles’ that are applied to the event sound.  Styles include fade-in, fade-out, synchronization and looping flags, and envelope control.

3.     The StartSound tag instructs the player to begin playing the sound.

 

DefineSound

The DefineSound tag defines an event sound.  It includes the sampling rate, size of each sample (8 or 16 bit), a stereo/mono flag, and an array of audio samples.  The audio data may be stored in three ways:

 

1.     As uncompressed raw samples.

2.     Compressed using a modified ADPCM algorithm

3.     Compressed using MP3 compression.

 

 (See class FDTDefineSound in the Flash File Format SDK)

 

DefineSound

Field             Type                     Comment
----------------  -----------------------  --------------------------------
Header            RECORDHEADER             Tag ID = 14
SoundId           UI16                     ID for this character
SoundFormat       UB[4]                    Format of SoundData
                  0 = sndCompressNone
                  1 = sndCompressADPCM
                  2 = sndCompressMP3
SoundRate         UB[2]                    The sampling rate
                  0 = 5.5 kHz
                  1 = 11 kHz
                  2 = 22 kHz
                  3 = 44 kHz
SoundSize         UB[1]                    Size of each sample
                  0 = snd8Bit
                  1 = snd16Bit
SoundType         UB[1]                    Mono or stereo sound
                  0 = sndMono
                  1 = sndStereo
SoundSampleCount  UI32                     Number of samples
SoundData         UI8[size of sound data]  The sound data; may be
                                           uncompressed, ADPCM or MP3

 

Notes:

 

·       The SoundId field uniquely identifies the sound so it can be played by StartSound.

 

The contents of SoundData vary depending on the value of the SoundFormat field in the DefineSound tag:

 

·       If SoundFormat is 0, SoundData contains raw, uncompressed samples.

·       If SoundFormat is 1, SoundData contains an ADPCMSOUNDDATA record.

·       If SoundFormat is 2, SoundData contains an MP3SOUNDDATA record.
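
The four flag fields above occupy a single byte, read most-significant bits first: SoundFormat (4 bits), then SoundRate (2 bits), SoundSize (1 bit) and SoundType (1 bit). A minimal packing helper, as a sketch (the function name is illustrative, not part of the SDK):

```cpp
#include <cassert>
#include <cstdint>

// Pack the DefineSound flag fields into one byte, most significant
// bits first: SoundFormat (4 bits), SoundRate (2), SoundSize (1),
// SoundType (1).  The helper name is illustrative, not from the SDK.
uint8_t packDefineSoundFlags(unsigned format, unsigned rate,
                             unsigned size, unsigned type) {
    return (uint8_t)((format << 4) | (rate << 2) | (size << 1) | type);
}
```

For example, an MP3, 44 kHz, 16-bit, stereo sound packs as format 2, rate 3, size 1, type 1.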

 

Sound Styles

The SOUNDINFO record modifies how an event sound is played.  An event sound is defined with the DefineSound tag.  Sound characteristics that can be modified include:

 

·       Whether the sound loops (repeats) and how many times it loops.

·       Simple fade in and fade out controls.

·       A sound ‘envelope’ for more sophisticated control over levels.

 

(See class FSoundInfo in the Flash File Format SDK)

 

SOUNDINFO

Field             Type                                    Comment
----------------  --------------------------------------  ----------------------------
SyncFlags         UB[4]                                   Sync flags:
                                                          0x1 = syncNoMultiple (don't
                                                          start the sound if it is
                                                          already playing)
                                                          0x2 = syncStop (stop the
                                                          sound)
HasEnvelope       UB[1]                                   Has envelope information if
                                                          equal to 1
HasLoops          UB[1]                                   Has loop information if
                                                          equal to 1
HasOutPoint       UB[1]                                   Has out point information if
                                                          equal to 1
HasInPoint        UB[1]                                   Has in point information if
                                                          equal to 1
InPoint           If HasInPoint, UI32                     Sound in point value
OutPoint          If HasOutPoint, UI32                    Sound out point value
LoopCount         If HasLoops, UI16                       Sound loop count
EnvelopeNumPoint  If HasEnvelope, nPoints = UI8           Sound envelope point count
EnvelopeRecords   If HasEnvelope, SOUNDENVELOPE[nPoints]  Sound envelope records

 

Sound Envelope

 (See class FSndEnv in the Flash File Format SDK)

 

SOUNDENVELOPE

Field   Type  Comment
------  ----  -------------------------------------------------
Mark44  UI32  Position of the envelope point, in 44 kHz samples
Level0  UI16  Volume level for the left channel (0-32768)
Level1  UI16  Volume level for the right channel (0-32768)

 

StartSound

StartSound is a control tag that starts (or stops) playing a sound defined by DefineSound.  The SoundId field identifies the sound to be played.  The SOUNDINFO field defines how the sound is played.  To stop the sound, set the syncStop flag in the SOUNDINFO record.

 

(See class FCTStartSound in the Flash File Format SDK)

 

StartSound

Field      Type          Comment
---------  ------------  -----------------------------
Header     RECORDHEADER  Tag ID = 15
SoundId    UI16          ID of sound character to play
SoundInfo  SOUNDINFO     Sound style information

 

ADPCM Compression

ADPCM (Adaptive Differential Pulse Code Modulation) is a family of audio compression and decompression algorithms.  It is a simple but efficient compression scheme that avoids any licensing or patent issues that arise with more sophisticated sound compression schemes, and helps to keep player implementations small.

 

ADPCM uses a modified Differential Pulse Code Modulation (DPCM) sampling technique: the encoding of each sample is derived by calculating a 'difference' (DPCM) value, and applying to it a formula that includes the previous quantization value. The result is a compressed code that can recreate almost the same subjective audio quality.

 

A common implementation takes 16-bit linear PCM samples and converts them to 4-bit codes, yielding a compression rate of 4:1. Public domain C code written by Jack Jansen is available at ftp://ftp.cwi.nl/pub/audio/adpcm.zip.

 

 SWF extends Jansen’s implementation to support 2, 3, 4 and 5 bit ADPCM codes.  When choosing a code size, there is the usual tradeoff between file-size and audio quality.
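
The core Jansen/IMA decode step for 4-bit codes looks like the following sketch; SWF's variant generalizes the same idea to 2-5 bit codes. The tables are the standard IMA tables from the public-domain implementation, not taken from this document:

```cpp
#include <cassert>

// Jansen/IMA ADPCM decode step for 4-bit codes.  StepSizeTable is the
// standard 89-entry IMA step-size table.
static const int stepSizeTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
    34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130,
    143, 157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449,
    494, 544, 598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411,
    1552, 1707, 1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026,
    4428, 4871, 5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442,
    11487, 12635, 13899, 15289, 16818, 18500, 20350, 22385, 24623,
    27086, 29794, 32767
};
static const int indexTable4[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8
};

// Decode one 4-bit code, updating the predictor (the reconstructed
// sample) and the step-size index in place.
void adpcmDecodeStep(int code, int& predictor, int& index) {
    int step = stepSizeTable[index];
    int diff = step >> 3;                  // rounding term
    if (code & 1) diff += step >> 2;
    if (code & 2) diff += step >> 1;
    if (code & 4) diff += step;
    if (code & 8) predictor -= diff;       // top bit is the sign
    else          predictor += diff;
    if (predictor > 32767)  predictor = 32767;   // clamp to 16-bit range
    if (predictor < -32768) predictor = -32768;
    index += indexTable4[code];
    if (index < 0)  index = 0;             // clamp table index
    if (index > 88) index = 88;
}
```

The initial predictor and index come from the InitialSample and InitialIndex fields of the ADPCM packet (see below).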

 

C++ source code to encode ADPCM sound data is available in the Flash File Format SDK.

(See class FDTDefineSoundADPCM in the Flash File Format SDK)

ADPCM Sound Data

The ADPCMSOUNDDATA record defines the size of the ADPCM codes used, and an array of ADPCMPACKETs that contain the ADPCM data.

 

ADPCMSOUNDDATA

Field          Type                      Comment
-------------  ------------------------  ----------------------------------
AdpcmCodeSize  UB[2]                     Bits per ADPCM code, less 2.  The
               0 = 2 bits/sample         actual size of each code is
               1 = 3 bits/sample         AdpcmCodeSize + 2.
               2 = 4 bits/sample
               3 = 5 bits/sample
AdpcmPackets   ADPCMPACKET[one or more]  Array of ADPCMPACKETs

 

 

ADPCM Packets

ADPCMPACKETs vary in structure depending on whether the sound is mono or stereo, 8-bit or 16-bit.  A stereo sound with 16-bit samples is arranged as follows:

 

ADPCMPACKET16STEREO

Field               Type                          Comment
------------------  ----------------------------  ------------------------------------
InitialSampleLeft   UI16                          First sample for the left channel.
                                                  Identical to the first sample in the
                                                  uncompressed sound.
InitialIndexLeft    UB[6]                         Initial index into the ADPCM
                                                  StepSizeTable* for the left channel
InitialSampleRight  UI16                          First sample for the right channel.
                                                  Identical to the first sample in the
                                                  uncompressed sound.
InitialIndexRight   UB[6]                         Initial index into the ADPCM
                                                  StepSizeTable* for the right channel
AdpcmCodeData       UB[8192 * (AdpcmCodeSize+2)]  4096 ADPCM codes per channel, 8192
                                                  in total.  Each code is
                                                  (AdpcmCodeSize + 2) bits.  Channel
                                                  data is interleaved left, then
                                                  right.

 

* Refer to the ADPCM source code for an explanation of StepSizeTable.
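
Given the packet layout above, the total size in bits of a 16-bit stereo packet can be computed as follows (a sketch; the function name is illustrative):

```cpp
#include <cassert>

// Size in bits of one ADPCMPACKET16STEREO: two 16-bit initial samples,
// two 6-bit initial indices, then 8192 interleaved codes of
// (AdpcmCodeSize + 2) bits each.
int adpcmStereoPacketBits(int adpcmCodeSize) {
    int nBits  = adpcmCodeSize + 2;        // actual bits per code
    int header = 2 * (16 + 6);             // initial sample + index, per channel
    return header + 8192 * nBits;
}
```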

 

ADPCM Code Data

If nBits = AdpcmCodeSize + 2, then AdpcmCodeData is arranged as follows:

 

ADPCMCODEDATA

Field               Type       Comment
------------------  ---------  --------------------------------------------------
LeftAdpcmCode1      UB[nBits]  Corresponds to 2nd sample in uncompressed sound
RightAdpcmCode1     UB[nBits]  Corresponds to 2nd sample in uncompressed sound
LeftAdpcmCode2      UB[nBits]  Corresponds to 3rd sample in uncompressed sound
RightAdpcmCode2     UB[nBits]  Corresponds to 3rd sample in uncompressed sound
...                 ...        ...
LeftAdpcmCode4096   UB[nBits]  Corresponds to 4096th sample in uncompressed sound
RightAdpcmCode4096  UB[nBits]  Corresponds to 4096th sample in uncompressed sound

 

MP3 Compression

MP3 is a sophisticated and complex audio compression algorithm.  It produces superior audio quality at better compression ratios than ADPCM.  Generally speaking, “MP3” refers to MPEG1 Layer 3; however, SWF also supports the later MPEG versions (V2 and V2.5) that were designed to support lower bitrates.

 

A complete description of MP3 compression is beyond the scope of this document.  For more information on MP3 see: http://www.mp3tech.org and http://www.iis.fhg.de/amm/techinf/layer3/index.html

 

MP3 Sound Data

The MP3SOUNDDATA record contains a DelaySeek value, which tells the player when to play the sound represented by the MPEG audio frames that follow.

 

MP3SOUNDDATA

Field       Type                    Comment
----------  ----------------------  --------------------------------------
DelaySeek*  UI16                    Number of samples to seek forward or
                                    delay (interpreted as a signed value;
                                    see below)
Mp3Frames   MP3FRAME[zero or more]  Array of MP3 frames

 

* The DelaySeek field has the following meaning:

 

·       If this value is positive, the player seeks this number of samples into the sound block before the sound is played.  The seek is only performed if the player reached the current frame via a GotoFrame action; otherwise no seek is performed.

·       If this value is negative, the player plays that many silent samples (the absolute value) before playing the sound block.  The player behaves this way regardless of how the current frame was reached.

 

DelaySeek is necessary because MP3 sound does not divide into an equal number of samples per frame; the player needs to calculate which sound belongs in which frame when a GotoFrame action is performed.  The Flash application stops exporting silent data after 0.5 seconds of silence, so the player needs to play some silence before sound starts coming in at a later frame.  This is the silence that would have been there had the Flash application been exporting silence all along.  (Note: only 0.5 seconds of silence is exported for MP3 compression.)
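
The positive/negative DelaySeek rules above can be sketched as follows (the struct and function names are illustrative, not part of the file format):

```cpp
#include <cassert>

// How a player might interpret DelaySeek: how many samples to skip
// into the block, and how many silent samples to emit first.
struct DelaySeekAction {
    int seekSamples;     // samples to skip into the sound block
    int silenceSamples;  // silent samples to play before the block
};

DelaySeekAction applyDelaySeek(int delaySeek, bool reachedViaGotoFrame) {
    if (delaySeek > 0) {
        // Positive: seek into the block, but only after a GotoFrame.
        return { reachedViaGotoFrame ? delaySeek : 0, 0 };
    }
    // Negative (or zero): play silence first, however the frame was reached.
    return { 0, -delaySeek };
}
```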

MP3 Frame

The MP3FRAME record corresponds exactly to an MPEG audio frame found in an MP3 music file.  The first 32 bits of the frame contain header information, followed by an array of bytes that are the encoded audio samples.  For more information on MPEG audio frames see: http://mp3tech.free.fr/programmers/frame_header.html

 

MP3FRAME

Field          Type    Comment
-------------  ------  --------------------------------------------------
Syncword       UB[11]  Frame sync.  All bits must be set.
MpegVersion    UB[2]   0 = MPEG Version 2.5
                       1 = reserved
                       2 = MPEG Version 2
                       3 = MPEG Version 1
                       MPEG2.5 is an extension to MPEG2 that handles very
                       low bitrates, allowing the use of lower sampling
                       frequencies.
Layer          UB[2]   0 = reserved
                       1 = Layer III
                       2 = Layer II
                       3 = Layer I
                       Layer is always equal to 1 for MP3 headers in SWF
                       files.  The “3” in MP3 refers to the Layer, not
                       the MpegVersion.
ProtectionBit  UB[1]   0 = protected by CRC
                       1 = not protected
                       If ProtectionBit == 0, a 16-bit CRC follows the
                       header.
Bitrate        UB[4]   Bitrate index.  Bitrates are in thousands of bits
                       per second; for example, 128 means 128000 bps.

                       Value  MPEG1  MPEG2.x
                       ---------------------
                         0     free   free
                         1      32      8
                         2      40     16
                         3      48     24
                         4      56     32
                         5      64     40
                         6      80     48
                         7      96     56
                         8     112     64
                         9     128     80
                        10     160     96
                        11     192    112
                        12     224    128
                        13     256    144
                        14     320    160
                        15     bad    bad

SamplingRate   UB[2]   Sampling rate in Hz:

                       Value  MPEG1  MPEG2  MPEG2.5
                       ----------------------------
                         0    44100  22050  11025
                         1    48000  24000  12000
                         2    32000  16000   8000
                         3     reserved

PaddingBit     UB[1]   0 = frame is not padded
                       1 = frame is padded with one extra slot
                       Padding is used to exactly fit the bitrate.
Reserved       UB[1]
ChannelMode    UB[2]   0 = Stereo
                       1 = Joint stereo (Stereo)
                       2 = Dual channel
                       3 = Single channel (Mono)
                       Dual channel files are made of two independent
                       mono channels; each one uses exactly half the
                       bitrate of the file.
ModeExtension  UB[2]   Used only in Joint stereo mode.
Copyright      UB[1]   0 = audio is not copyrighted
                       1 = audio is copyrighted
Original       UB[1]   0 = copy of original media
                       1 = original media
Emphasis       UB[2]   0 = none
                       1 = 50/15 ms
                       2 = reserved
                       3 = CCITT J.17
SampleData     UB[size of sample data*]
                       The encoded audio samples.

 

* The size of the sample data is calculated like this (using integer arithmetic):

 

Size = (((MpegVersion == MPEG1 ? 144 : 72) * Bitrate) / SamplingRate) + PaddingBit - 4

 

For example, the size of the sample data for an MPEG1 frame with a Bitrate of 128000, a SamplingRate of 44100, and a PaddingBit of 1 is:

 

Size = (144 * 128000) / 44100 + 1 – 4

     = 414 bytes
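
The same formula, expressed as code (a direct transcription of the integer arithmetic above):

```cpp
#include <cassert>

// Size in bytes of the SampleData that follows the 4-byte MP3 frame
// header.  bitrate is in bits per second, samplingRate in Hz; MPEG2
// and MPEG2.5 frames use the coefficient 72 instead of 144.
int mp3SampleDataSize(bool isMpeg1, int bitrate, int samplingRate, int paddingBit) {
    int coeff = isMpeg1 ? 144 : 72;
    return (coeff * bitrate) / samplingRate + paddingBit - 4;
}
```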

Streaming Sound

SWF supports a streaming sound mode where sound data is downloaded and played in tight synchronization with the timeline. In this mode, sound packets are stored with each frame. Ordinarily, if a CPU cannot render the frames of a movie as quickly as the movie's frame rate specifies, the player slows down the animation rate rather than skipping frames. When streaming sound is present, however, the player skips frames in order to maintain tight synchronization with the sound track.

 

Note: A timeline can only have a single streaming sound playing at a time, but each movie clip (or sprite) can have its own streaming sound.

SoundStreamHead

If a timeline contains streaming sound data, there must be a SoundStreamHead (or SoundStreamHead2) tag before the first sound data block (see SoundStreamBlock).  The SoundStreamHead tag defines the data format of the sound data, the recommended playback format, and the average number of samples per SoundStreamBlock.

 

(See class FDTSoundStreamHead in the Flash File Format SDK)

 

SoundStreamHead

Field                   Type                  Comment
----------------------  --------------------  -------------------------------
Header                  RECORDHEADER          Tag ID = 18
Reserved                UB[4] = 0             Always zero
PlaybackSoundRate       UB[2]                 Playback sampling rate
                        0 = 5.5 kHz
                        1 = 11 kHz
                        2 = 22 kHz
                        3 = 44 kHz
PlaybackSoundSize       UB[1] = 1             Playback sample size
                        0 = 8-bit
                        1 = 16-bit
PlaybackSoundType       UB[1]                 Number of playback channels:
                        0 = sndMono           mono or stereo
                        1 = sndStereo
StreamSoundCompression  UB[4] = 1             Format of the streaming sound
                        0 = Uncompressed      data.  Always 1 (ADPCM)
                        1 = Compressed ADPCM
                        2 = Compressed MP3
StreamSoundRate         UB[2]                 The sampling rate of the
                        0 = 5.5 kHz           streaming sound data
                        1 = 11 kHz
                        2 = 22 kHz
                        3 = 44 kHz
StreamSoundSize         UB[1] = 1             The sample size of the
                        0 = 8-bit             streaming sound data.  Always
                        1 = 16-bit            16-bit
StreamSoundType         UB[1]                 Number of channels in the
                        0 = sndMono           streaming sound data
                        1 = sndStereo
StreamSoundSampleCount  UI16                  Average number of samples in
                                              each SoundStreamBlock

 

Notes:

·       The PlaybackSound* fields define how the player should play (mix) the streaming sound.

·       The StreamSound* fields define the structure of the sound data in the SoundStreamBlocks.
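
StreamSoundSampleCount is typically close to the streaming sample rate divided by the movie frame rate, since one SoundStreamBlock is emitted per frame. A sketch of that calculation (an assumption about typical authoring behavior, not a rule mandated by the format):

```cpp
#include <cassert>

// Average samples per SoundStreamBlock: one block is emitted per movie
// frame, so it carries roughly sampleRate / frameRate samples.
int samplesPerStreamBlock(int sampleRate, int frameRate) {
    return sampleRate / frameRate;
}
```

For example, a 44 kHz stream in a 30 frames-per-second movie averages 1470 samples per block.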

 

SoundStreamHead2

The SoundStreamHead2 tag is identical to the SoundStreamHead tag, apart from allowing different values for StreamSoundCompression and StreamSoundSize.

 

(See class FDTSoundStreamHead2 in the Flash File Format SDK)

 

SoundStreamHead2

Field                   Type                  Comment
----------------------  --------------------  -------------------------------
Header                  RECORDHEADER          Tag ID = 45
Reserved                UB[4] = 0             Always zero
PlaybackSoundRate       UB[2]                 Playback sampling rate
                        0 = 5.5 kHz
                        1 = 11 kHz
                        2 = 22 kHz
                        3 = 44 kHz
PlaybackSoundSize       UB[1]                 Playback sample size
                        0 = 8-bit
                        1 = 16-bit
PlaybackSoundType       UB[1]                 Number of playback channels:
                        0 = sndMono           mono or stereo
                        1 = sndStereo
StreamSoundCompression  UB[4]                 Format of the streaming sound
                        0 = Uncompressed      data
                        1 = Compressed ADPCM
                        2 = Compressed MP3
StreamSoundRate         UB[2]                 The sampling rate of the
                        0 = 5.5 kHz           streaming sound data
                        1 = 11 kHz
                        2 = 22 kHz
                        3 = 44 kHz
StreamSoundSize         UB[1]                 The sample size of the
                        0 = 8-bit             streaming sound data.  If
                        1 = 16-bit            StreamSoundCompression is 0,
                                              may be 8 or 16 bit; otherwise
                                              always 16-bit
StreamSoundType         UB[1]                 Number of channels in the
                        0 = sndMono           streaming sound data
                        1 = sndStereo
StreamSoundSampleCount  UI16                  Average number of samples in
                                              each SoundStreamBlock

 

SoundStreamBlock

The SoundStreamBlock tag defines sound data that is interleaved with frame data so that sounds can be played as the movie is streamed over a network connection.  The SoundStreamBlock tag must be preceded by a SoundStreamHead (or SoundStreamHead2) tag.

 

(See class FDTSoundStreamBlock in the Flash File Format SDK)

 

SoundStreamBlock

Field             Type                          Comment
----------------  ----------------------------  ---------------------
Header            RECORDHEADER                  Tag ID = 19
StreamSoundData*  UI8[size of compressed data]  Compressed sound data

 

The contents of StreamSoundData vary depending on the value of the StreamSoundCompression field in the SoundStreamHead (or SoundStreamHead2) tag:

 

·       If StreamSoundCompression is 0, StreamSoundData contains raw, uncompressed samples.

·       If StreamSoundCompression is 1, StreamSoundData contains an ADPCMSOUNDDATA record.

·       If StreamSoundCompression is 2, StreamSoundData contains an MP3STREAMSOUNDDATA record.

 

 

MP3STREAMSOUNDDATA

Field         Type          Comment
------------  ------------  --------------------------------------------
SampleCount   UI16          Number of samples represented by this block
Mp3SoundData  MP3SOUNDDATA  MP3 frames, with DelaySeek

 

SDK Examples

 

The SDK contains C++ examples that demonstrate how to create sounds in SWF:

 

·       FExampleSound.cpp uses the low-level manager to read a .wav file and write it to the SWF file as an ADPCM compressed sound.  The class FDTDefineSoundWav reads the .wav file and compresses it using 2, 3, 4 or 5-bit ADPCM compression.

·       HFExampleSound.cpp uses the high-level manager to create two sounds.  The first is created from a .wav file.  The second is created from an array of sample data using the HFSound class.  Both sounds are compressed with ADPCM compression.