version 3, including all changes.
.
Rev |
Author |
# |
Line |
3 |
JohnMcPherson |
1 |
See also our SoxNotes for user-submitted examples/comments. |
|
|
2 |
---- |
1 |
perry |
3 |
SoX |
|
|
4 |
!!!SoX |
|
|
5 |
NAME |
|
|
6 |
SYNOPSIS |
|
|
7 |
DESCRIPTION |
|
|
8 |
OPTIONS |
|
|
9 |
FILE TYPES |
|
|
10 |
EFFECTS |
|
|
11 |
BUGS |
|
|
12 |
FILES |
|
|
13 |
SEE ALSO |
|
|
14 |
NOTICES |
|
|
15 |
AUTHOR |
|
|
16 |
---- |
|
|
17 |
!!NAME |
|
|
18 |
|
|
|
19 |
|
|
|
20 |
sox - Sound eXchange : universal sound sample translator |
|
|
21 |
!!SYNOPSIS |
|
|
22 |
|
|
|
23 |
|
|
|
24 |
__sox__ ''infile outfile'' |
|
|
25 |
|
|
|
26 |
|
|
|
27 |
__sox__ [[ ''general options'' ] [[ ''format |
|
|
28 |
options'' ] ''infile'' |
|
|
29 |
[[ ''format options'' ] ''outfile'' |
|
|
30 |
[[ ''effect'' [[ ''effect options'' ] ... ] |
|
|
31 |
|
|
|
32 |
|
|
|
33 |
__soxmix__ ''infile1 infile2 outfile'' |
|
|
34 |
|
|
|
35 |
|
|
|
36 |
__soxmix__ [[ ''general options'' ] [[ ''format |
|
|
37 |
options'' ] ''infile1'' |
|
|
38 |
[[ ''format options'' ] ''infile2'' |
|
|
39 |
[[ ''format options'' ] ''outfile'' |
|
|
40 |
[[ ''effect'' [[ ''effect options'' ] ... ] |
|
|
41 |
|
|
|
42 |
|
|
|
43 |
__General options:__ |
|
|
44 |
[[ -h ] [[ -p ] [[ -v ''volume'' ] [[ -V ] |
|
|
45 |
|
|
|
46 |
|
|
|
47 |
__Format options:__ |
|
|
48 |
[[ -t ''filetype'' ] [[ -r ''rate'' ] [[ |
|
|
49 |
-s/-u/-U/-A/-a/-i/-g/-f ] [[ -b/-w/-l ] [[ -c ''channels'' |
|
|
50 |
] [[ -x ] [[ -e ] |
|
|
51 |
|
|
|
52 |
|
|
|
53 |
__Effects: |
|
|
54 |
avg__ [[ -l | -r | -f | -b | n,n,...,n ]__ |
|
|
55 |
band__ [[ -n ] ''center'' [[ ''width'' ]__ |
|
|
56 |
bandpass__ ''frequency bandwidth''__ |
|
|
57 |
bandreject__ ''frequency bandwidth''__ |
|
|
58 |
chorus__ ''gain-in gain-out delay decay speed |
|
|
59 |
depth'' |
|
|
60 |
-s | -t [[ ''delay decay speed depth'' -s | -t ]__ |
|
|
61 |
compand__ |
|
|
62 |
''attack1'',''decay1''[[,''attack2'',''decay2''...]'' |
|
|
63 |
in-dB1'',''out-dB1''[[,''in-dB2'',''out-dB2''...] |
|
|
64 |
[[ ''gain'' [[ ''initial-volume'' [[ ''delay'' ] ] |
|
|
65 |
]__ |
|
|
66 |
copy |
|
|
67 |
dcshift__ ''shift'' [[ ''limitergain'' ]__ |
|
|
68 |
deemph |
|
|
69 |
earwax |
|
|
70 |
echo__ ''gain-in gain-out delay decay'' [[ ''delay |
|
|
71 |
decay ...'' ]__ |
|
|
72 |
echos__ ''gain-in gain-out delay decay'' [[ ''delay |
|
|
73 |
decay ...'' ]__ |
|
|
74 |
fade__ [[ ''type'' ] ''fade-in-length'' [[ |
|
|
75 |
''stop-time'' [[ ''fade-out-length'' ] ]__ |
|
|
76 |
filter__ [[ ''low'' ]-[[ ''high'' ] [[ |
|
|
77 |
''window-len'' [[ ''beta'' ]]__ |
|
|
78 |
flanger__ ''gain-in gain-out delay decay speed'' |
|
|
79 |
'' |
|
|
80 |
highp__ ''frequency''__ |
|
|
81 |
highpass__ ''frequency''__ |
|
|
82 |
lowp__ ''frequency''__ |
|
|
83 |
lowpass__ ''frequency''__ |
|
|
84 |
map |
|
|
85 |
mask |
|
|
86 |
pan__ ''direction''__ |
|
|
87 |
phaser__ ''gain-in gain-out delay decay speed'' |
|
|
88 |
'' |
|
|
89 |
pick__ [[ ''-1'' | ''-2'' | ''-3'' | ''-4'' | |
|
|
90 |
''-l'' | ''-r'' ]__ |
|
|
91 |
pitch__ ''shift'' [[ ''width interpole fade'' |
|
|
92 |
]__ |
|
|
93 |
polyphase__ [[ -w __nut'' / ''ham'' |
|
|
94 |
''-width'' ''long'' / ''short'' / # |
|
|
95 |
''-cutoff #'' ]__ |
|
|
96 |
rate |
|
|
97 |
resample__ [[ -qs | -q | -ql ] [[ ''rolloff'' [[ |
|
|
98 |
''beta'' ] ]__ |
|
|
99 |
reverb__ ''gain-out reverb-time delay'' [[ ''delay'' |
|
|
100 |
... ]__ |
|
|
101 |
reverse |
|
|
102 |
silence__ ''above_periods'' [[ ''duration |
|
|
103 |
threshold''[[ ''d'' | ''%'' ] [[ ''below_periods |
|
|
104 |
duration threshold''[[ ''d'' | ''%'' ]]__ |
|
|
105 |
speed__ [[ -c ] ''factor''__ |
|
|
106 |
split |
|
|
107 |
stat__ [[ -s ''n'' ] [[ -rms ] [[ -v ] [[ -d ]__ |
|
|
108 |
stretch__ [[ ''factor'' [[ ''window fade shift |
|
|
109 |
fading'' ]__ |
|
|
110 |
swap__ [[ ''1 2'' | ''1 2 3 4'' ]__ |
|
|
111 |
synth__ [[ ''length'' ] ''type mix'' [[ ''freq'' [[ |
|
|
112 |
''-freq2'' ] [[ ''off'' ] [[ ''ph'' ] [[ ''p1'' ] [[ |
|
|
113 |
''p2'' ] [[ ''p3'' ]__ |
|
|
114 |
trim__ ''start'' [[ ''length'' ]__ |
|
|
115 |
vibro__ ''speed'' [[ ''depth'' ]__ |
|
|
116 |
vol__ ''gain'' [[ ''type'' [[ ''limitergain'' ] |
|
|
117 |
] |
|
|
118 |
!!DESCRIPTION |
|
|
119 |
|
|
|
120 |
|
|
|
121 |
''SoX'' is a command line program that can convert most |
|
|
122 |
popular audio files to most other popular audio file |
|
|
123 |
formats. It can optionally change the audio sample data type |
|
|
124 |
and apply one or more sound effects to the file during this |
|
|
125 |
translation. |
|
|
126 |
|
|
|
127 |
|
|
|
128 |
''soxmix'' is functionally the same as the command line |
|
|
129 |
program ''sox'' except that it takes two files as input |
|
|
130 |
and mixes the audio together to produce a single file as |
|
|
131 |
output. It has a restriction that both input files must be |
|
|
132 |
of the same data type and sample rate. |
|
|
133 |
|
|
|
134 |
|
|
|
135 |
There are two types of audio file formats that ''SoX'' |
|
|
136 |
can work with. The first are self-describing file formats. |
|
|
137 |
These contain a header that completely describes the |
|
|
138 |
characteristics of the audio data that follows. |
|
|
139 |
|
|
|
140 |
|
|
|
141 |
The second type is header-less data, sometimes called |
|
|
142 |
raw data. A user must pass enough information to ''SoX'' |
|
|
143 |
on the command line so that it knows what type of data it |
|
|
144 |
contains. |
|
|
145 |
|
|
|
146 |
|
|
|
147 |
Audio data can usually be totally described by four |
|
|
148 |
characteristics: |
|
|
149 |
|
|
|
150 |
|
|
|
151 |
rate |
|
|
152 |
|
|
|
153 |
|
|
|
154 |
The sample rate is in samples per second. For example, CD |
|
|
155 |
sample rates are at 44100. |
|
|
156 |
|
|
|
157 |
|
|
|
158 |
data size |
|
|
159 |
|
|
|
160 |
|
|
|
161 |
The precision the data is stored in. Most popular are 8-bit |
|
|
162 |
bytes or 16-bit words. |
|
|
163 |
|
|
|
164 |
|
|
|
165 |
data encoding |
|
|
166 |
|
|
|
167 |
|
|
|
168 |
What encoding the data type uses. Examples are u-law, ADPCM, |
|
|
169 |
or signed linear data. |
|
|
170 |
|
|
|
171 |
|
|
|
172 |
channels |
|
|
173 |
|
|
|
174 |
|
|
|
175 |
How many channels are contained in the audio data. Mono and |
|
|
176 |
Stereo are the two most common. |
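For uncompressed (linear) data, the rate, data size and channel count listed above are enough to work out how many bytes a stretch of audio occupies, or how long a header-less file plays. The following is a minimal Python sketch of that arithmetic; the function names are hypothetical and are not part of SoX.

```python
def raw_byte_count(rate, bytes_per_sample, channels, seconds):
    """Size in bytes of uncompressed audio with the given characteristics."""
    return int(rate * bytes_per_sample * channels * seconds)


def raw_duration(byte_count, rate, bytes_per_sample, channels):
    """Playing time in seconds of a header-less file of the given size."""
    return byte_count / (rate * bytes_per_sample * channels)


# One second of CD audio: 44100 samples per second, 16-bit (2-byte) words, stereo.
cd_second = raw_byte_count(44100, 2, 2, 1)
print(cd_second)  # 176400
```

The same numbers explain why raw data cannot be interpreted without help: a file of 176400 bytes could equally be two seconds of mono 16-bit audio at the same rate.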
|
|
177 |
|
|
|
178 |
|
|
|
179 |
Please refer to the __soxexam(1)__ manual page for a long |
|
|
180 |
description with examples on how to use SoX with various |
|
|
181 |
types of file formats. |
|
|
182 |
!!OPTIONS |
|
|
183 |
|
|
|
184 |
|
|
|
185 |
The option syntax is a little grotty, but in |
|
|
186 |
essence: |
|
|
187 |
|
|
|
188 |
|
|
|
189 |
sox File.au file.wav |
|
|
190 |
|
|
|
191 |
|
|
|
192 |
translates a sound file in SUN Sparc .AU format into a |
|
|
193 |
Microsoft .WAV file, while |
|
|
194 |
|
|
|
195 |
|
|
|
196 |
sox -v 0.5 file.au -r 12000 file.wav mask |
|
|
197 |
|
|
|
198 |
|
|
|
199 |
does the same format translation but also lowers the |
|
|
200 |
amplitude by 1/2, changes the sampling rate to 12000 hertz, |
|
|
201 |
and applies the __mask__ sound effect to the audio |
|
|
202 |
data. |
|
|
203 |
|
|
|
204 |
|
|
|
205 |
The following will mix two sound files together to |
|
|
206 |
produce a single sound file. |
|
|
207 |
|
|
|
208 |
|
|
|
209 |
soxmix music.wav voice.wav mixed.wav |
|
|
210 |
|
|
|
211 |
|
|
|
212 |
__Format options:__ |
|
|
213 |
|
|
|
214 |
|
|
|
215 |
Format options affect the audio samples that they |
|
|
216 |
immediately precede. If they are placed before the input |
|
|
217 |
file name then they affect the input data. If they are |
|
|
218 |
placed before the output file name then they will affect the |
|
|
219 |
output data. By taking advantage of this, you can override an |
|
|
220 |
input file's corrupted header or produce an output file that |
|
|
221 |
is in a totally different style than the input file. It is also |
|
|
222 |
how SoX is informed about the format of raw input |
|
|
223 |
data. |
|
|
224 |
|
|
|
225 |
|
|
|
226 |
__-t__ ''filetype'' |
|
|
227 |
|
|
|
228 |
|
|
|
229 |
Gives the type of the sound sample file. Useful when the file |
|
|
230 |
extension is not standard or for specifying the .auto file |
|
|
231 |
type. |
|
|
232 |
|
|
|
233 |
|
|
|
234 |
__-r__ ''rate'' |
|
|
235 |
|
|
|
236 |
|
|
|
237 |
Gives the sample rate in Hertz of the file. To cause the |
|
|
238 |
output file to have a different sample rate than the input |
|
|
239 |
file, include this option as a part of the output |
|
|
240 |
options. |
|
|
241 |
If the input and output files have different rates then a |
|
|
242 |
sample rate change effect must be run. If a sample rate |
|
|
243 |
changing effect is not specified then a default one will |
|
|
244 |
internally be run by SoX using its default |
|
|
245 |
parameters. |
|
|
246 |
|
|
|
247 |
|
|
|
248 |
__-s/-u/-U/-A/-a/-i/-g/-f__ |
|
|
249 |
|
|
|
250 |
|
|
|
251 |
The sample data encoding is signed linear (2's complement), |
|
|
252 |
unsigned linear, u-law (logarithmic), A-law (logarithmic), |
|
|
253 |
ADPCM, IMA_ADPCM, GSM, or Floating-point. |
|
|
254 |
U-law (actually shorthand for mu-law) and A-law are the U.S. |
|
|
255 |
and international standards for logarithmic telephone sound |
|
|
256 |
compression. When uncompressed u-law has roughly the |
|
|
257 |
precision of 14-bit PCM audio and A-law has roughly the |
|
|
258 |
precision of 13-bit PCM audio. |
|
|
259 |
A-law and u-law data is sometimes encoded using a reversed |
|
|
260 |
bit-ordering (i.e. MSB becomes LSB). Internally, SoX |
|
|
261 |
understands how to work with this encoding but there is |
|
|
262 |
currently no command line option to specify it. If you need |
|
|
263 |
this support then you can use the pseudo file types of |
|
|
264 |
ADPCM is a form of sound compression that has a good |
|
|
265 |
compromise between good sound quality and fast |
|
|
266 |
encoding/decoding time. It is used for telephone sound |
|
|
267 |
compression and places where full fidelity is not as |
|
|
268 |
important. When uncompressed it has roughly the precision of |
|
|
269 |
16-bit PCM audio. Popular versions of ADPCM include G.726, MS |
|
|
270 |
ADPCM, and IMA ADPCM. The __-a__ flag has different |
|
|
271 |
meanings in different file handlers. In __.wav__ files it |
|
|
272 |
represents MS ADPCM files, in all others it means G.726 |
|
|
273 |
ADPCM. IMA ADPCM is a specific form of ADPCM compression, |
|
|
274 |
slightly simpler and slightly lower fidelity than |
|
|
275 |
Microsoft's flavor of ADPCM. IMA ADPCM is also called DVI |
|
|
276 |
ADPCM. |
|
|
277 |
GSM is a standard used for telephone sound compression in |
|
|
278 |
European countries and it is gaining popularity because of its |
|
|
279 |
quality. It usually is CPU intensive to work with GSM audio |
|
|
280 |
data. |
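The logarithmic idea behind u-law can be illustrated with the continuous textbook mu-law curve, which spends more resolution on quiet samples than on loud ones. This Python sketch is not the segmented 8-bit codec used in telephony and is not SoX's code; the function names and the use of samples normalized to the range -1.0 to 1.0 are assumptions made for illustration.

```python
import math

MU = 255  # constant used by 8-bit mu-law


def mulaw_compress(x):
    """Map a linear sample in [-1.0, 1.0] onto the companded range [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress: recover the linear sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


quiet = mulaw_compress(0.01)  # a quiet sample is pushed well away from zero
restored = mulaw_expand(quiet)  # expanding brings it back to about 0.01
```

Because quiet samples are spread over a larger share of the code range before quantizing, 8 stored bits can approach the precision of a wider linear sample at low levels.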
|
|
281 |
|
|
|
282 |
|
|
|
283 |
__-b/-w/-l__ |
|
|
284 |
|
|
|
285 |
|
|
|
286 |
The sample data size is in bytes, 16-bit words, or 32-bit |
|
|
287 |
long words. |
|
|
288 |
|
|
|
289 |
|
|
|
290 |
__-x__ The sample data is in XINU format; that is, it |
|
|
291 |
comes from a machine with the opposite word order than yours |
|
|
292 |
and must be swapped according to the word-size given above. |
|
|
293 |
Only 16-bit and 32-bit integer data may be swapped. |
|
|
294 |
Machine-format floating-point data is not |
|
|
295 |
portable. |
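The swap performed for opposite-word-order data amounts to reversing the bytes inside each sample. A minimal Python sketch of that operation on a raw buffer follows; the function name is hypothetical and this is not how SoX itself is implemented.

```python
def swap_words(raw_bytes, word_size):
    """Reverse the byte order of every word in a raw sample buffer.

    word_size is 2 for 16-bit words or 4 for 32-bit long words.
    """
    if word_size not in (2, 4):
        raise ValueError("only 16-bit and 32-bit words can be swapped")
    if len(raw_bytes) % word_size != 0:
        raise ValueError("buffer length is not a whole number of words")
    out = bytearray(len(raw_bytes))
    for i in range(0, len(raw_bytes), word_size):
        out[i:i + word_size] = raw_bytes[i:i + word_size][::-1]
    return bytes(out)


swapped = swap_words(b"\x01\x02\x03\x04", 2)  # b"\x02\x01\x04\x03"
```

Swapping twice returns the original buffer, which is a quick way to check that a file really was written with the opposite word order.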
|
|
296 |
|
|
|
297 |
|
|
|
298 |
__-c__ ''channels'' |
|
|
299 |
|
|
|
300 |
|
|
|
301 |
The number of sound channels in the data file. This may be |
|
|
302 |
1, 2, or 4; for mono, stereo, or quad sound data. To cause |
|
|
303 |
the output file to have a different number of channels than |
|
|
304 |
the input file, include this option with the output file |
|
|
305 |
options. If the input and output file have a different |
|
|
306 |
number of channels then the avg effect must be used. If the |
|
|
307 |
avg effect is not specified on the command line it will be |
|
|
308 |
invoked internally with default parameters. |
|
|
309 |
|
|
|
310 |
|
|
|
311 |
__-e__ When used after the input filename (so that it |
|
|
312 |
applies to the output file) it allows you to avoid giving an |
|
|
313 |
output filename and will not produce an output file. It will |
|
|
314 |
apply any specified effects to the input file. This is |
|
|
315 |
mainly useful with the __stat__ effect but can be used |
|
|
316 |
with others. |
|
|
317 |
|
|
|
318 |
|
|
|
319 |
__General options:__ |
|
|
320 |
|
|
|
321 |
|
|
|
322 |
__-h__ Print version number and usage |
|
|
323 |
information. |
|
|
324 |
|
|
|
325 |
|
|
|
326 |
__-p__ Run in preview mode and run fast. This will |
|
|
327 |
somewhat speed up SoX when the output format has a different |
|
|
328 |
number of channels and a different rate than the input file. |
|
|
329 |
Currently, this defaults to using the __rate__ effect |
|
|
330 |
instead of the __resample__ effect for sample rate |
|
|
331 |
changes. |
|
|
332 |
|
|
|
333 |
|
|
|
334 |
__-v__ ''volume'' |
|
|
335 |
|
|
|
336 |
|
|
|
337 |
Change amplitude (floating point); less than 1.0 decreases, |
|
|
338 |
greater than 1.0 increases. May use a negative number to |
|
|
339 |
invert the phase of the audio data. It is interesting to |
|
|
340 |
note that we perceive volume logarithmically but this |
|
|
341 |
adjusts the amplitude linearly. |
|
|
342 |
Note: see the __stat__ effect for information on finding |
|
|
343 |
the maximum value that can be used with this option without |
|
|
344 |
causing audio data to be clipped. |
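Since the volume option scales amplitude linearly while loudness is usually discussed in decibels, it can help to convert between the two. The Python sketch below uses the standard amplitude relation of 20 times the base-10 logarithm; the function names are hypothetical.

```python
import math


def factor_to_db(factor):
    """Change in level, in dB, produced by a linear amplitude factor.

    The sign of the factor only inverts the phase, so its absolute value is used.
    A factor of 0 has no finite dB value and raises ValueError.
    """
    return 20.0 * math.log10(abs(factor))


def db_to_factor(db):
    """Linear amplitude factor that produces the given change in dB."""
    return 10.0 ** (db / 20.0)


half = factor_to_db(0.5)  # about -6.02 dB
```

So a volume of 0.5 lowers the level by roughly 6 dB, and each further halving lowers it by roughly 6 dB more.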
|
|
345 |
|
|
|
346 |
|
|
|
347 |
__-V__ Print a description of processing phases. Useful |
|
|
348 |
for figuring out exactly how ''SoX'' is mangling your |
|
|
349 |
sound samples. |
|
|
350 |
!!FILE TYPES |
|
|
351 |
|
|
|
352 |
|
|
|
353 |
''SoX'' attempts to determine the file type of input |
|
|
354 |
files automatically by looking at the header of the audio |
|
|
355 |
file. When it is unable to detect the file type or if it is an |
|
|
356 |
output file then it uses the file extension of the file to |
|
|
357 |
determine what type of file format handler to use. This can |
|
|
358 |
be overridden by specifying the |
|
|
359 |
'' |
|
|
360 |
|
|
|
361 |
|
|
|
362 |
The input and output files may be read from standard in and |
|
|
363 |
out. This is done by specifying '-' as the |
|
|
364 |
filename. |
|
|
365 |
|
|
|
366 |
|
|
|
367 |
File formats which have headers are checked; if that header |
|
|
368 |
doesn't seem right, the program exits with an appropriate |
|
|
369 |
message. |
|
|
370 |
|
|
|
371 |
|
|
|
372 |
The following file formats are supported: |
|
|
373 |
|
|
|
374 |
|
|
|
375 |
__.8svx__ |
|
|
376 |
|
|
|
377 |
|
|
|
378 |
Amiga 8SVX musical instrument description |
|
|
379 |
format. |
|
|
380 |
|
|
|
381 |
|
|
|
382 |
__.aiff__ |
|
|
383 |
|
|
|
384 |
|
|
|
385 |
AIFF files used on Apple IIc/IIgs and SGI. Note: the AIFF |
|
|
386 |
format supports only one SSND chunk. It does not support |
|
|
387 |
multiple sound chunks, or the 8SVX musical instrument |
|
|
388 |
description format. AIFF files are multimedia archives and |
|
|
389 |
can have multiple audio and picture chunks. You may need a |
|
|
390 |
separate archiver to work with them. |
|
|
391 |
|
|
|
392 |
|
|
|
393 |
__.au__ |
|
|
394 |
|
|
|
395 |
|
|
|
396 |
SUN Microsystems AU files. There are apparently many types |
|
|
397 |
of .au files; DEC has invented its own with a different |
|
|
398 |
magic number and word order. The .au handler can read these |
|
|
399 |
files but will not write them. Some .au files have valid AU |
|
|
400 |
headers and some do not. The latter are probably original |
|
|
401 |
SUN u-law 8000 Hz samples. These can be dealt with using the |
|
|
402 |
__.ul__ format (see below). |
|
|
403 |
|
|
|
404 |
|
|
|
405 |
__.avr__ |
|
|
406 |
|
|
|
407 |
|
|
|
408 |
Audio Visual Research |
|
|
409 |
The AVR format is produced by a number of commercial |
|
|
410 |
packages on the Mac. |
|
|
411 |
|
|
|
412 |
|
|
|
413 |
__.cdr__ |
|
|
414 |
|
|
|
415 |
|
|
|
416 |
CD-R |
|
|
417 |
CD-R files are used in mastering music on Compact Disks. The |
|
|
418 |
audio data on a CD-R disk is a raw audio file with a format |
|
|
419 |
of stereo 16-bit signed samples at a 44.1 kHz sample rate. |
|
|
420 |
There is a special blocking/padding oddity at the end of the |
|
|
421 |
audio file, which is why it needs its own handler. |
|
|
422 |
|
|
|
423 |
|
|
|
424 |
__.cvs__ |
|
|
425 |
|
|
|
426 |
|
|
|
427 |
Continuously Variable Slope Delta modulation |
|
|
428 |
Used to compress speech audio for applications such as voice |
|
|
429 |
mail. |
|
|
430 |
|
|
|
431 |
|
|
|
432 |
__.dat__ |
|
|
433 |
|
|
|
434 |
|
|
|
435 |
Text Data files |
|
|
436 |
These files contain a textual representation of the sample |
|
|
437 |
data. There is one line at the beginning that contains the |
|
|
438 |
sample rate. Subsequent lines contain two numeric data |
|
|
439 |
items: the time since the beginning of the first sample and |
|
|
440 |
the sample value. Values are normalized so that the maximum |
|
|
441 |
and minimum are 1.00 and -1.00. This file format can be used |
|
|
442 |
to create data files for external programs such as FFT |
|
|
443 |
analyzers or graph routines. SoX can also convert a file in |
|
|
444 |
this format back into one of the other file |
|
|
445 |
formats. |
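The layout described above (a first line carrying the sample rate, then one line per sample holding the time and the normalized value) can be produced with a few lines of Python. This is a sketch only: the exact wording of the header line and the column spacing are assumptions, so compare against a file written by SoX before relying on it.

```python
def to_dat_lines(samples, rate):
    """Build the text lines of a .dat-style file.

    samples: floats already normalized to the range -1.0 to 1.0.
    rate: sample rate in samples per second.
    """
    lines = ["; Sample Rate %d" % rate]  # header wording is an assumption
    for index, value in enumerate(samples):
        # Time since the first sample, followed by the sample value.
        lines.append("%.8g %.8g" % (index / rate, value))
    return lines


for line in to_dat_lines([0.0, 0.5, -1.0], 8000):
    print(line)
```

Text in this shape is easy to feed to a plotting tool or an FFT analyzer, which is the use the format is intended for.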
|
|
446 |
|
|
|
447 |
|
|
|
448 |
__.gsm__ |
|
|
449 |
|
|
|
450 |
|
|
|
451 |
GSM 06.10 Lossy Speech Compression |
|
|
452 |
A standard for compressing speech which is used in the |
|
|
453 |
Global Standard for Mobile telecommunications (GSM). It's good |
|
|
454 |
for its purpose, shrinking audio data size, but it will |
|
|
455 |
introduce lots of noise when a given sound sample is encoded |
|
|
456 |
and decoded multiple times. This format is used by some |
|
|
457 |
voice mail applications. It is rather CPU intensive. |
|
|
458 |
GSM in __SoX__ is optional and requires access to an |
|
|
459 |
external GSM library. To see if there is support for gsm run |
|
|
460 |
__sox -h__ and look for it under the list of supported |
|
|
461 |
file formats. |
|
|
462 |
|
|
|
463 |
|
|
|
464 |
__.hcom__ |
|
|
465 |
|
|
|
466 |
|
|
|
467 |
Macintosh HCOM files. These are (apparently) Mac FSSD files |
|
|
468 |
with some variant of Huffman compression. The Macintosh has |
|
|
469 |
wacky file formats and this format handler apparently |
|
|
470 |
doesn't handle all the ones it should. Mac users will need |
|
|
471 |
their usual arsenal of file converters to deal with an HCOM |
|
|
472 |
file under Unix or DOS. |
|
|
473 |
|
|
|
474 |
|
|
|
475 |
__.maud__ |
|
|
476 |
|
|
|
477 |
|
|
|
478 |
An Amiga format |
2 |
perry |
479 |
An IFF-conform sound file type, registered by MS !MacroSystem |
1 |
perry |
480 |
Computer GmbH, published along with the |
|
|
481 |
|
|
|
482 |
|
|
|
483 |
__.nul__ |
|
|
484 |
|
|
|
485 |
|
|
|
486 |
Null file handler. This is a fake file handler that acts as if |
|
|
487 |
it's reading a stream of 0's from a file or fake writing |
|
|
488 |
output to a file. This is not a very useful file handler in |
|
|
489 |
most cases. It might be useful in some scripts where you do |
|
|
490 |
not want to read or write from a real file but would like to |
|
|
491 |
specify a filename for consistency. |
|
|
492 |
|
|
|
493 |
|
|
|
494 |
__.ogg__ |
|
|
495 |
|
|
|
496 |
|
|
|
497 |
Ogg Vorbis Compressed Audio. |
|
|
498 |
Ogg Vorbis is an open, patent-free CODEC designed for |
|
|
499 |
compressing music and streaming audio. It is similar to MP3, |
|
|
500 |
VQF, AAC, and other lossy formats. __SoX__ can decode all |
|
|
501 |
types of Ogg Vorbis files, but can only encode at 128 kbps. |
|
|
502 |
Decoding is somewhat CPU intensive and encoding is very CPU |
|
|
503 |
intensive. |
|
|
504 |
Ogg Vorbis in __SoX__ is optional and requires access to |
|
|
505 |
external Ogg Vorbis libraries. To see if there is support |
|
|
506 |
for Ogg Vorbis run __sox -h__ and look for it under the |
|
|
507 |
list of supported file formats as |
|
|
508 |
__ |
|
|
509 |
|
|
|
510 |
|
|
|
511 |
__ossdsp__ |
|
|
512 |
|
|
|
513 |
|
|
|
514 |
OSS /dev/dsp device driver |
|
|
515 |
This is a pseudo-file type and can be optionally compiled |
|
|
516 |
into SoX. Run __sox -h__ to see if you have support for |
|
|
517 |
this file type. When this driver is used it allows you to |
|
|
518 |
open up the OSS /dev/dsp file and configure it to use the |
|
|
519 |
same data format as passed in to __SoX__. It works for |
|
|
520 |
both playing and recording sound samples. When playing sound |
|
|
521 |
files it attempts to set up the OSS driver to use the same |
|
|
522 |
format as the input file. It is suggested to always override |
|
|
523 |
the output values to use the highest quality samples your |
|
|
524 |
sound card can handle. Example: ''-t ossdsp -w -s |
|
|
525 |
/dev/dsp'' |
|
|
526 |
|
|
|
527 |
|
|
|
528 |
__.sf__ |
|
|
529 |
|
|
|
530 |
|
|
|
531 |
IRCAM Sound Files. |
|
|
532 |
Sound Files are used by academic music software such as the |
2 |
perry |
533 |
CSound package, and the !MixView sound sample |
1 |
perry |
534 |
editor. |
|
|
535 |
|
|
|
536 |
|
|
|
537 |
__.sph__ |
|
|
538 |
|
|
|
539 |
|
|
|
540 |
SPHERE (SPeech HEader Resources) is a file format defined by |
|
|
541 |
NIST (National Institute of Standards and Technology) and is |
|
|
542 |
used with speech audio. SoX can read these files when they |
|
|
543 |
contain u-law and PCM data. It will ignore any header |
|
|
544 |
information that says the data is compressed using |
|
|
545 |
''shorten'' compression and will treat the data as either |
|
|
546 |
u-law or PCM. This will allow SoX and the command line |
|
|
547 |
''shorten'' program to be run together using pipes to |
|
|
548 |
uncompress the data and then pass the result to SoX for |
|
|
549 |
processing. |
|
|
550 |
|
|
|
551 |
|
|
|
552 |
__.smp__ |
|
|
553 |
|
|
|
554 |
|
2 |
perry |
555 |
Turtle Beach !SampleVision files. |
|
|
556 |
SMP files are for use with the PC-DOS package !SampleVision |
1 |
perry |
557 |
by Turtle Beach Softworks. This package is for communication |
|
|
558 |
to several MIDI samplers. All sample rates are supported by |
|
|
559 |
the package, although not all are supported by the samplers |
|
|
560 |
themselves. Currently loop points are ignored. |
|
|
561 |
|
|
|
562 |
|
|
|
563 |
__.snd__ |
|
|
564 |
|
|
|
565 |
|
|
|
566 |
Under DOS this file format is the same as the __.sndt__ |
|
|
567 |
format. Under all other platforms it is the same as the |
|
|
568 |
__.au__ format. |
|
|
569 |
|
|
|
570 |
|
|
|
571 |
__.sndt__ |
|
|
572 |
|
|
|
573 |
|
2 |
perry |
574 |
!SoundTool files. |
1 |
perry |
575 |
This is an older DOS file format. |
|
|
576 |
|
|
|
577 |
|
|
|
578 |
__sunau__ |
|
|
579 |
|
|
|
580 |
|
|
|
581 |
Sun /dev/audio device driver |
|
|
582 |
This is a pseudo-file type and can be optionally compiled |
|
|
583 |
into SoX. Run __sox -h__ to see if you have support for |
|
|
584 |
this file type. When this driver is used it allows you to |
|
|
585 |
open up a Sun /dev/audio file and configure it to use the |
|
|
586 |
same data type as passed in to __SoX.__ It works for both |
|
|
587 |
playing and recording sound samples. When playing sound |
|
|
588 |
files it attempts to set up the audio driver to use the same |
|
|
589 |
format as the input file. It is suggested to always override |
|
|
590 |
the output values to use the highest quality samples your |
|
|
591 |
hardware can handle. Example: ''-t sunau -w -s |
|
|
592 |
/dev/audio'' or ''-t sunau -U -c 1 /dev/audio'' for |
|
|
593 |
older sun equipment. |
|
|
594 |
|
|
|
595 |
|
|
|
596 |
__.txw__ |
|
|
597 |
|
|
|
598 |
|
|
|
599 |
Yamaha TX-16W sampler. |
|
|
600 |
A file format from a Yamaha sampling keyboard which wrote |
|
|
601 |
IBM-PC format 3.5 |
|
|
602 |
|
|
|
603 |
|
|
|
604 |
__.vms__ |
|
|
605 |
|
|
|
606 |
|
|
|
607 |
More info to come. |
|
|
608 |
Used to compress speech audio for applications such as voice |
|
|
609 |
mail. |
|
|
610 |
|
|
|
611 |
|
|
|
612 |
__.voc__ |
|
|
613 |
|
|
|
614 |
|
|
|
615 |
Sound Blaster VOC files. |
|
|
616 |
VOC files are multi-part and contain silence parts, looping, |
|
|
617 |
and different sample rates for different chunks. On input, |
|
|
618 |
the silence parts are filled out, loops are rejected, and |
|
|
619 |
sample data with a new sample rate is rejected. Silence with |
|
|
620 |
a different sample rate is generated appropriately. On |
|
|
621 |
output, silence is not detected, nor are impossible sample |
|
|
622 |
rates. Note, this version now supports playing VOC files |
|
|
623 |
with multiple blocks and supports playing files containing |
|
|
624 |
u-law and A-law samples. |
|
|
625 |
|
|
|
626 |
|
|
|
627 |
__vorbis__ |
|
|
628 |
|
|
|
629 |
|
|
|
630 |
See __.ogg__ format. |
|
|
631 |
|
|
|
632 |
|
|
|
633 |
__.wav__ |
|
|
634 |
|
|
|
635 |
|
|
|
636 |
Microsoft .WAV RIFF files. |
|
|
637 |
These appear to be very similar to IFF files, but not the |
|
|
638 |
same. They are the native sound file format of Windows. |
|
|
639 |
(Obviously, Windows was of such incredible importance to the |
|
|
640 |
computer industry that it just had to have its own sound |
|
|
641 |
file format.) Normally __.wav__ files have all formatting |
|
|
642 |
information in their headers, and so do not need any format |
|
|
643 |
options specified for an input file. If any are, they will |
|
|
644 |
override the file header, and you will be warned to this |
|
|
645 |
effect. You had better know what you are doing! Output |
|
|
646 |
format options will cause a format conversion, and the |
|
|
647 |
__.wav__ will be written appropriately. SoX currently can |
|
|
648 |
read PCM, ULAW, ALAW, MS ADPCM, and IMA (or DVI) ADPCM. It |
|
|
649 |
can write all of these formats including __(NEW!)__ the |
|
|
650 |
ADPCM encoding. |
|
|
651 |
|
|
|
652 |
|
|
|
653 |
__.wve__ |
|
|
654 |
|
|
|
655 |
|
|
|
656 |
Psion 8-bit A-law |
|
|
657 |
These are 8-bit A-law 8 kHz sound files used on the Psion |
|
|
658 |
palmtop portable computer. |
|
|
659 |
|
|
|
660 |
|
|
|
661 |
__.raw__ |
|
|
662 |
|
|
|
663 |
|
|
|
664 |
Raw files (no header). |
|
|
665 |
The sample rate, size (byte, word, etc), and encoding |
|
|
666 |
(signed, unsigned, etc.) of the sample file must be given. |
|
|
667 |
The number of channels defaults to 1. |
|
|
668 |
|
|
|
669 |
|
|
|
670 |
__.ub, .sb, .uw, .sw, .ul, .al, .lu, .la, |
|
|
671 |
.sl__ |
|
|
672 |
|
|
|
673 |
|
|
|
674 |
These are several suffixes which serve as a shorthand for |
|
|
675 |
raw files with a given size and encoding. Thus, __ub, sb, |
|
|
676 |
uw, sw, ul, al, lu, la__ and __sl__ correspond to |
|
|
677 |
__ |
|
|
678 |
|
|
|
679 |
|
|
|
680 |
__.auto__ |
|
|
681 |
|
|
|
682 |
|
|
|
683 |
This is a "meta-type": specifying this type for an input |
|
|
684 |
file triggers some code that tries to guess the real type by |
|
|
685 |
looking for magic words in the header. If the type can't be |
|
|
686 |
guessed, the program exits with an error message. The input |
|
|
687 |
must be a plain file, not a pipe. This type can't be used |
|
|
688 |
for output files. |
|
|
689 |
!!EFFECTS |
|
|
690 |
|
|
|
691 |
|
|
|
692 |
Multiple effects may be applied to the audio data by |
|
|
693 |
specifying them one after another at the end of the command |
|
|
694 |
line. |
|
|
695 |
|
|
|
696 |
|
|
|
697 |
avg [[ ''-l'' | ''-r'' | ''-f'' | ''-b'' | |
|
|
698 |
''n,n,...,n'' ] |
|
|
699 |
|
|
|
700 |
|
|
|
701 |
Reduce the number of channels by averaging the samples, or |
|
|
702 |
duplicate channels to increase the number of channels. This |
|
|
703 |
effect is automatically used when the number of input |
|
|
704 |
channels differs from the number of output channels. When |
|
|
705 |
reducing the number of channels it is possible to manually |
|
|
706 |
specify the avg effect and use the ''-l'', ''-r'', |
|
|
707 |
''-f'', or ''-b'' options to select only the left, |
|
|
708 |
right, front, or back channel(s) for the output instead of |
|
|
709 |
averaging the channels. The ''-f'' and ''-b'' options |
|
|
710 |
maintain left/right stereo separation; use the avg effect |
|
|
711 |
twice to select a single channel. |
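The difference between averaging channels, selecting one channel, and duplicating a channel can be shown on interleaved stereo samples. This Python sketch only illustrates the idea; the function names are hypothetical and SoX's own rounding and scaling may differ.

```python
def average_to_mono(interleaved):
    """Average interleaved stereo samples (L, R, L, R, ...) into one channel."""
    left = interleaved[0::2]
    right = interleaved[1::2]
    return [(l + r) / 2 for l, r in zip(left, right)]


def pick_left(interleaved):
    """Keep only the left channel instead of averaging."""
    return list(interleaved[0::2])


def duplicate_to_stereo(mono):
    """Increase the channel count by copying each mono sample to both channels."""
    stereo = []
    for sample in mono:
        stereo.extend((sample, sample))
    return stereo


mono = average_to_mono([10, 20, -4, 4])  # [15.0, 0.0]
```

Note how the second frame averages to zero: content that is out of phase between the two channels cancels when averaged, which is one reason to select a single channel instead.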
|
|
712 |
|
|
|
713 |
|
|
|
714 |
The avg effect can also be invoked with up to 16 |
|
|
715 |
double-precision numbers, which specify the proportion of |
|
|
716 |
each input channel that is to be mixed into each output |
|
|
717 |
channel. In two-channel mode, 4 numbers are given: l- |
|
|
718 |
|
|
|
719 |
|
|
|
720 |
It is also possible to use the 16 numbers to expand or |
|
|
721 |
reduce the channel count; just specify 0 for unused |
|
|
722 |
channels. Finally, if fewer than 4 numbers are given, |
|
|
723 |
certain special abbreviations may be invoked; see the source |
|
|
724 |
code for details. |
|
|
725 |
|
|
|
726 |
|
|
|
727 |
band __[[__ ''-n'' __]__ ''center'' __[[__ |
|
|
728 |
''width'' __]__ |
|
|
729 |
|
|
|
730 |
|
|
|
731 |
Apply a band-pass filter. The frequency response drops |
|
|
732 |
logarithmically around the ''center'' frequency. The |
|
|
733 |
''width'' gives the slope of the drop. The frequencies at |
|
|
734 |
''center + width'' and ''center - width'' will be half |
|
|
735 |
of their original amplitudes. __Band__ defaults to a mode |
|
|
736 |
oriented to pitched signals, i.e. voice, singing, or |
|
|
737 |
instrumental music. The ''-n'' (for noise) option uses |
|
|
738 |
the alternate mode for un-pitched signals. __Warning:__ |
|
|
739 |
''-n'' introduces a power-gain of about 11dB in the |
|
|
740 |
filter, so beware of output clipping. __Band__ introduces |
|
|
741 |
noise in the shape of the filter, i.e. peaking at the |
|
|
742 |
''center'' frequency and settling around it. See |
|
|
743 |
__filter__ for a bandpass effect with steeper |
|
|
744 |
shoulders. |
|
|
745 |
|
|
|
746 |
|
|
|
747 |
bandpass ''frequency bandwidth'' |
|
|
748 |
|
|
|
749 |
|
|
|
750 |
Butterworth bandpass filter. Description coming |
|
|
751 |
soon! |
|
|
752 |
|
|
|
753 |
|
|
|
754 |
bandreject ''frequency bandwidth'' |
|
|
755 |
|
|
|
756 |
|
|
|
757 |
Butterworth bandreject filter. Description coming |
|
|
758 |
soon! |
|
|
759 |
|
|
|
760 |
|
|
|
761 |
chorus ''gain-in gain-out delay decay speed |
|
|
762 |
depth'' |
|
|
763 |
|
|
|
764 |
|
|
|
765 |
-s | ''-t [[ delay decay speed depth -s'' | ''-t ...'' |
|
|
766 |
] |
|
|
767 |
|
|
|
768 |
|
|
|
769 |
Add a chorus to a sound sample. Each quadruple |
|
|
770 |
delay/decay/speed/depth gives the delay in milliseconds and |
|
|
771 |
the decay (relative to gain-in) with a modulation speed in |
|
|
772 |
Hz using depth in milliseconds. The modulation is either |
|
|
773 |
sinusoidal (-s) or triangular (-t). Gain-out is the volume |
|
|
774 |
of the output. |
|
|
775 |
|
|
|
776 |
|
|
|
777 |
compand |
|
|
778 |
''attack1,decay1''[[,''attack2,decay2''...] |
|
|
779 |
|
|
|
780 |
|
|
|
781 |
''in-dB1,out-dB1''[[,''in-dB2,out-dB2''...] |
|
|
782 |
|
|
|
783 |
|
|
|
784 |
[[''gain'' [[''initial-volume'' [[''delay'' ] ] |
|
|
785 |
] |
|
|
786 |
|
|
|
787 |
|
|
|
788 |
Compand (compress or expand) the dynamic range of a sample. |
|
|
789 |
The attack and decay time specify the integration time over |
|
|
790 |
which the absolute value of the input signal is integrated |
|
|
791 |
to determine its volume; attacks refer to increases in |
|
|
792 |
volume and decays refer to decreases. Where more than one |
|
|
793 |
pair of attack/decay parameters are specified, each channel |
|
|
794 |
is treated separately and the number of pairs must agree |
|
|
795 |
with the number of input channels. The second parameter is a |
|
|
796 |
list of points on the compander's transfer function |
|
|
797 |
specified in dB relative to the maximum possible signal |
|
|
798 |
amplitude. The input values must be in a strictly increasing |
|
|
799 |
order but the transfer function does not have to be |
|
|
800 |
monotonically rising. The special value ''-inf'' may be |
|
|
801 |
used to indicate that the input volume should be associated |
|
|
802 |
output volume. The points ''-inf,-inf'' and ''0,0'' |
|
|
803 |
are assumed; the latter may be overridden, but the former |
|
|
804 |
may not. |
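The transfer-function points described above can be pictured with a small sketch. It assumes straight-line interpolation in dB between neighbouring points and a constant offset below the first explicit point; both are illustrative assumptions, not a statement of how SoX computes the curve. The name `compand_level_db` is hypothetical.

```python
def compand_level_db(in_db, points):
    """Map an input level (dB) to an output level (dB).

    `points` is a list of (in_db, out_db) pairs with strictly
    increasing input values, as the compand description requires.
    """
    pts = sorted(points)
    # The point 0,0 is assumed unless the caller supplies one at 0 dB.
    if pts[-1][0] < 0:
        pts.append((0.0, 0.0))
    if in_db <= pts[0][0]:
        # Assumption: below the first explicit point, keep that point's
        # offset (a slope of 1 heading towards -inf,-inf).
        return in_db + (pts[0][1] - pts[0][0])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= in_db <= x1:
            t = (in_db - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    # Above the last point: hold the last output level.
    return pts[-1][1]


if __name__ == "__main__":
    curve = [(-60.0, -40.0), (-30.0, -20.0)]
    for level in (-70.0, -45.0, -15.0, 0.0):
        print(level, "->", compand_level_db(level, curve))
```

With the two points above, quiet passages are raised by 20 dB while the level at 0 dB is left untouched, which is the usual shape of a compressor curve.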
|
|
805 |
|
|
|
806 |
|
|
|
807 |
The third (optional) parameter is a post-processing gain in |
|
|
808 |
dB which is applied after the compression has taken place; |
|
|
809 |
the fourth (optional) parameter is an initial volume to be |
|
|
810 |
assumed for each channel when the effect starts. This |
|
|
811 |
permits the user to supply a nominal level initially, so |
|
|
812 |
that, for example, a very large gain is not applied to |
|
|
813 |
initial signal levels before the companding action has begun |
|
|
814 |
to operate: it is quite probable that in such an event, the |
|
|
815 |
output would be severely clipped while the compander gain |
|
|
816 |
properly adjusts itself. |
|
|
817 |
|
|
|
818 |
|
|
|
819 |
The fifth (optional) parameter is a delay in seconds. The |
|
|
820 |
input signal is analyzed immediately to control the |
|
|
821 |
compander, but it is delayed before being fed to the volume |
|
|
822 |
adjuster. Specifying a delay approximately equal to the |
|
|
823 |
attack/decay times allows the compander to effectively |
|
|
824 |
operate in a |
|
|
825 |
|
|
|
826 |
|
|
|
827 |
copy |
|
|
828 |
|
|
|
829 |
|
|
|
830 |
Copy the input file to the output file. This is the default |
|
|
831 |
effect if both files have the same sampling |
|
|
832 |
rate. |
|
|
833 |
|
|
|
834 |
|
|
|
835 |
dcshift ''shift'' [[ ''limitergain'' ] |
|
|
836 |
|
|
|
837 |
|
|
|
838 |
Apply a DC shift to the audio data using a basic linear amplitude formula. This is most useful if your audio data tends not to be centered around a value of 0. Shifting it back lets you apply the largest volume adjustments without clipping the audio data.
The first option is the ''shift'' value. It is a floating point number that indicates the amount to shift.
An optional ''limitergain'' value can be specified as well. It should have a value much less than 1.0 and is used only on peaks to prevent clipping.
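As a rough illustration of why a DC shift helps, the sketch below estimates the offset of a block of samples as their mean and shifts them back towards 0. It assumes samples normalised to the range -1.0 to 1.0 and uses a hard clamp in place of the limiter; the helper names `dc_offset` and `dc_shift` are hypothetical and do not come from SoX.

```python
def dc_offset(samples):
    """Estimate the DC offset as the mean of the samples."""
    return sum(samples) / len(samples)


def dc_shift(samples, shift):
    """Add `shift` to every sample, clamping to [-1.0, 1.0].

    Assumption: a hard clamp stands in for the limiter that the
    optional limitergain parameter would control.
    """
    return [max(-1.0, min(1.0, s + shift)) for s in samples]


if __name__ == "__main__":
    audio = [0.5, 0.25, 0.75, 0.5]
    offset = dc_offset(audio)
    print("offset:", offset)
    print("centred:", dc_shift(audio, -offset))
```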
|
|
849 |
|
|
|
850 |
|
|
|
851 |
deemph |
|
|
852 |
|
|
|
853 |
|
|
|
854 |
Apply a treble attenuation shelving filter to samples in audio CD format. This corrects the frequency response of pre-emphasized recordings. The filtering is defined in the standard document ISO 908.
|
|
858 |
|
|
|
859 |
|
|
|
860 |
earwax |
|
|
861 |
|
|
|
862 |
|
|
|
863 |
Makes sound easier to listen to on headphones. Adds audio cues to samples in audio CD format so that, when listened to on headphones, the stereo image is moved from inside the listener's head (standard for headphones) to outside and in front of the listener (standard for speakers). See www.geocities.com/beinges for a full explanation.
|
|
870 |
|
|
|
871 |
|
|
|
872 |
echo ''gain-in gain-out delay decay'' [[ ''delay decay |
|
|
873 |
...'' ] |
|
|
874 |
|
|
|
875 |
|
|
|
876 |
Add echoing to a sound sample. Each delay/decay part gives |
|
|
877 |
the delay in milliseconds and the decay (relative to |
|
|
878 |
gain-in) of that echo. Gain-out is the volume of the |
|
|
879 |
output. |
|
|
880 |
|
|
|
881 |
|
|
|
882 |
echos ''gain-in gain-out delay decay'' [[ ''delay decay |
|
|
883 |
...'' ] |
|
|
884 |
|
|
|
885 |
|
|
|
886 |
Add a sequence of echos to a sound sample. Each delay/decay |
|
|
887 |
part gives the delay in milliseconds and the decay (relative |
|
|
888 |
to gain-in) of that echo. Gain-out is the volume of the |
|
|
889 |
output. |
|
|
890 |
|
|
|
891 |
|
|
|
892 |
fade [[ ''type'' ] ''fade-in-length'' |
|
|
893 |
|
|
|
894 |
|
|
|
895 |
[[ ''stop-time'' [[ ''fade-out-length'' ] ] |
|
|
896 |
|
|
|
897 |
|
|
|
898 |
Add a fade effect to the beginning, end, or both of the |
|
|
899 |
audio data. |
|
|
900 |
|
|
|
901 |
|
|
|
902 |
For fade-ins, this starts from the first sample and ramps |
|
|
903 |
the volume of the audio from 0 to full volume over |
|
|
904 |
''fade-in-length'' seconds. Specify 0 seconds if no |
|
|
905 |
fade-in is wanted. |
|
|
906 |
|
|
|
907 |
|
|
|
908 |
For fade-outs, the audio data will be truncated at the |
|
|
909 |
stop-time and the volume will be ramped from full volume |
|
|
910 |
down to 0 starting at ''fade-out-length'' seconds before |
|
|
911 |
the ''stop-time''. No fade-out is performed if these |
|
|
912 |
options are not specified. |
|
|
913 |
All times can be specified either as periods of time or as sample counts. To specify time periods, use the format hh:mm:ss.frac. To specify sample counts, give the number of samples and append the letter 's' (for example 8000s).
An optional ''type'' can be specified to change the type of envelope. Choices are q for a quarter of a sine wave, h for half a sine wave, t for a linear slope, l for logarithmic, and p for an inverted parabola. The default is a linear slope.
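The envelope types can be compared with a short sketch that returns the fade-in gain at a normalised position between 0 (start of the fade) and 1 (full volume). The formulas are plausible shapes for each named curve, not a copy of the SoX source; in particular the range of the logarithmic curve is an assumption, and `fade_gain` is a hypothetical name.

```python
import math


def fade_gain(kind, t):
    """Gain in [0, 1] at normalised fade-in position t in [0, 1]."""
    if kind == "t":  # linear slope (the default)
        return t
    if kind == "q":  # quarter of a sine wave
        return math.sin(t * math.pi / 2)
    if kind == "h":  # half a sine wave
        return (1 - math.cos(t * math.pi)) / 2
    if kind == "l":  # logarithmic; the 100 dB span is an assumption
        return 10 ** ((t - 1) * 5)
    if kind == "p":  # inverted parabola
        return 1 - (1 - t) ** 2
    raise ValueError("unknown fade type: " + kind)


if __name__ == "__main__":
    for kind in "qhtlp":
        print(kind, [round(fade_gain(kind, i / 4), 3) for i in range(5)])
```

A fade-out can reuse the same curves by evaluating them at 1 - t.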
|
|
923 |
|
|
|
924 |
|
|
|
925 |
filter [[ ''low'' ]-[[ ''high'' ] [[ ''window-len'' [[ ''beta'' ] ]
|
|
927 |
|
|
|
928 |
|
|
|
929 |
Apply a Sinc-windowed lowpass, highpass, or bandpass filter |
|
|
930 |
of given window length to the signal. ''low'' refers to |
|
|
931 |
the frequency of the lower 6dB corner of the filter. |
|
|
932 |
''high'' refers to the frequency of the upper 6dB corner |
|
|
933 |
of the filter. |
|
|
934 |
|
|
|
935 |
|
|
|
936 |
A lowpass filter is obtained by leaving ''low'' |
|
|
937 |
unspecified, or 0. A highpass filter is obtained by leaving |
|
|
938 |
''high'' unspecified, or 0, or greater than or equal to |
|
|
939 |
the Nyquist frequency. |
|
|
940 |
|
|
|
941 |
|
|
|
942 |
The ''window-len'', if unspecified, defaults to 128. |
|
|
943 |
Longer windows give a sharper cutoff, smaller windows a more |
|
|
944 |
gradual cutoff. |
|
|
945 |
|
|
|
946 |
|
|
|
947 |
The ''beta'', if unspecified, defaults to 16. This |
|
|
948 |
selects a Kaiser window. You can select a Nuttall window by |
|
|
949 |
specifying anything |
|
|
950 |
''resample__ effect. |
|
|
951 |
|
|
|
952 |
|
|
|
953 |
flanger ''gain-in gain-out delay decay speed'' |
|
|
954 |
'' |
|
|
955 |
|
|
|
956 |
|
|
|
957 |
Add a flanger to a sound sample. Each triple delay/decay/speed gives the delay in milliseconds and the decay (relative to gain-in) with a modulation speed in Hz. The modulation is either sinusoidal (-s) or triangular (-t). Gain-out is the volume of the output.
|
|
962 |
|
|
|
963 |
|
|
|
964 |
highp ''frequency'' |
|
|
965 |
|
|
|
966 |
|
|
|
967 |
Apply a single pole recursive high-pass filter. The frequency response drops logarithmically with ''frequency'' in the middle of the drop. The slope of the filter is quite gentle. See __filter__ for a highpass effect with sharper cutoff.
|
|
972 |
|
|
|
973 |
|
|
|
974 |
highpass ''frequency'' |
|
|
975 |
|
|
|
976 |
|
|
|
977 |
Butterworth highpass filter. Description coming soon!
|
|
979 |
|
|
|
980 |
|
|
|
981 |
lowp ''frequency'' |
|
|
982 |
|
|
|
983 |
|
|
|
984 |
Apply a single pole recursive low-pass filter. The frequency |
|
|
985 |
response drops logarithmically with ''frequency'' in the |
|
|
986 |
middle of the drop. The slope of the filter is quite gentle. |
|
|
987 |
See __filter__ for a lowpass effect with sharper |
|
|
988 |
cutoff. |
|
|
989 |
|
|
|
990 |
|
|
|
991 |
lowpass ''frequency'' |
|
|
992 |
|
|
|
993 |
|
|
|
994 |
Butterworth lowpass filter. Description coming soon!
|
|
996 |
|
|
|
997 |
|
|
|
998 |
map |
|
|
999 |
|
|
|
1000 |
|
|
|
1001 |
Display a list of loops in a sample, and miscellaneous loop |
|
|
1002 |
info. |
|
|
1003 |
|
|
|
1004 |
|
|
|
1005 |
mask |
|
|
1006 |
|
|
|
1007 |
|
|
|
1008 |
Add |
|
|
1009 |
|
|
|
1010 |
|
|
|
1011 |
pan ''direction'' |
|
|
1012 |
|
|
|
1013 |
|
|
|
1014 |
Pan the sound of an audio file from one channel to another. This is done by changing the volume of the input channels so that the sound fades out on one channel and fades in on another. If the number of input channels is different from the number of output channels, this effect tries to handle the difference intelligently. For instance, if the input contains 1 channel and the output contains 2 channels, it will create the missing channel itself. The ''direction'' is a value from -1.0 to 1.0: -1.0 represents far left and 1.0 represents far right. Values in between pan the sound without totally muting the opposite channel.
|
|
1025 |
|
|
|
1026 |
|
|
|
1027 |
phaser ''gain-in gain-out delay decay speed'' |
|
|
1028 |
'' |
|
|
1029 |
|
|
|
1030 |
|
|
|
1031 |
Add a phaser to a sound sample. Each triple delay/decay/speed gives the delay in milliseconds and the decay (relative to gain-in) with a modulation speed in Hz. The modulation is either sinusoidal (-s) or triangular (-t). The decay should be less than 0.5 to avoid feedback. Gain-out is the volume of the output.
|
|
1037 |
|
|
|
1038 |
|
|
|
1039 |
pick [[ ''-1'' | ''-2'' | ''-3'' | ''-4'' | |
|
|
1040 |
''-l'' | ''-r'' ] |
|
|
1041 |
|
|
|
1042 |
|
|
|
1043 |
Select the left or right channel of a stereo sample, or one |
|
|
1044 |
of four channels in a quadraphonic sample. The ''-l'' and |
|
|
1045 |
''-r'' options represent either the left or right |
|
|
1046 |
channel. It is required that you use the __-c 1__ command |
|
|
1047 |
line option in order to force the output file to contain |
|
|
1048 |
only 1 channel. |
|
|
1049 |
|
|
|
1050 |
|
|
|
1051 |
pitch ''shift [[ width interpole fade ]'' |
|
|
1052 |
|
|
|
1053 |
|
|
|
1054 |
Change the pitch of file without affecting its duration by |
|
|
1055 |
cross-fading shifted samples. ''shift'' is given in |
|
|
1056 |
cents. Use a positive value to shift to treble, negative |
|
|
1057 |
value to shift to bass. Default shift is 0. ''width'' of |
|
|
1058 |
window is in ms. Default width is 20ms. Try 30ms to lower |
|
|
1059 |
pitch, and 10ms to raise pitch. ''interpole'' option, can |
|
|
1060 |
be |
|
|
1061 |
''fade'' option, can be |
|
|
1062 |
'' |
|
|
1063 |
|
|
|
1064 |
|
|
|
1065 |
polyphase [[ ''-w'' ''nut'' / ''ham'' |
|
|
1066 |
'' |
|
|
1067 |
|
|
|
1068 |
|
|
|
1069 |
[[ ''-width'' ''long'' / ''short'' / ''#'' |
|
|
1070 |
'' |
|
|
1071 |
|
|
|
1072 |
|
|
|
1073 |
[[ ''-cutoff #'' ] |
|
|
1074 |
|
|
|
1075 |
|
|
|
1076 |
Translate input sampling rate to output sampling rate via |
|
|
1077 |
polyphase interpolation, a DSP algorithm. This method is |
|
|
1078 |
slow and uses lots of RAM, but gives much better results |
|
|
1079 |
than __rate.__ |
|
|
1080 |
|
|
|
1081 |
|
|
|
1082 |
-w |
|
|
1083 |
nut.'' |
|
|
1084 |
|
|
|
1085 |
|
|
|
1086 |
-width long / short / # : specify the (approximate) width of |
|
|
1087 |
the filter. ''long'' is 1024 samples; ''short'' is 128 |
|
|
1088 |
samples. Alternatively, an exact number can be used. Default |
|
|
1089 |
is ''long.'' The ''short'' option is __not__ |
|
|
1090 |
recommended, as it produces poor quality |
|
|
1091 |
results. |
|
|
1092 |
|
|
|
1093 |
|
|
|
1094 |
-cutoff # : specify the filter cutoff frequency as a fraction of the frequency bandwidth, also known as the Nyquist frequency. Please see the ''resample'' effect for further information on the Nyquist frequency. If upsampling, this is the fraction of the original signal that should go through. If downsampling, this is the fraction of the signal left after downsampling. The default is 0.95. Remember that this is a floating point value.
|
|
1102 |
|
|
|
1103 |
|
|
|
1104 |
rate |
|
|
1105 |
|
|
|
1106 |
|
|
|
1107 |
Translate input sampling rate to output sampling rate via linear interpolation to the Least Common Multiple of the two sampling rates. This is the default effect if the two files have different sampling rates and the preview option was specified. This is fast but noisy: the spectrum of the original sound will be shifted upwards and duplicated faintly when up-translating by a multiple.
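To make the Least Common Multiple concrete, the sketch below computes the common rate for two sampling rates and the integer factors that relate each rate to it. It only illustrates the arithmetic; the function names are hypothetical and nothing here is taken from the SoX implementation.

```python
from math import gcd


def lcm_rate(in_rate, out_rate):
    """Least Common Multiple of two sampling rates, in Hz."""
    return in_rate * out_rate // gcd(in_rate, out_rate)


def rate_factors(in_rate, out_rate):
    """Return (up, down): interpolate by `up`, then keep every `down`-th sample."""
    common = lcm_rate(in_rate, out_rate)
    return common // in_rate, common // out_rate


if __name__ == "__main__":
    print(lcm_rate(44100, 48000))      # common rate for CD -> DAT
    print(rate_factors(44100, 48000))  # (up, down)
    print(rate_factors(8000, 16000))
```

For 44100 Hz to 48000 Hz the common rate is 7056000 Hz, which shows why rate pairs that are not simple multiples of each other are costly to convert exactly.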
|
|
1114 |
|
|
|
1115 |
|
|
|
1116 |
Lerp-ing is acceptable for cheap 8-bit sound hardware, but |
|
|
1117 |
for CD-quality sound you should instead use either |
|
|
1118 |
__resample__ or __polyphase.__ If you are wondering |
|
|
1119 |
which rate changing effects to use, you will want to read a |
|
|
1120 |
detailed analysis of all of them at |
|
|
1121 |
http://eakaw2.et.tu-dresden.de/~wilde/resample/resample.html |
|
|
1122 |
|
|
|
1123 |
|
|
|
1124 |
resample [[ ''-qs'' | ''-q'' | ''-ql'' ] [[ ''rolloff'' [[ ''beta'' ] ]
|
|
1127 |
|
|
|
1128 |
|
|
|
1129 |
Translate input sampling rate to output sampling rate via |
|
|
1130 |
simulated analog filtration. This method is slower than |
|
|
1131 |
__rate,__ but gives much better results. |
|
|
1132 |
|
|
|
1133 |
|
|
|
1134 |
By default, linear interpolation is used, with a window width of about 45 samples at the lower of the two rates. This gives an accuracy of about 16 bits, but insufficient stopband rejection if you want a rolloff greater than about 0.80 of the Nyquist frequency.
|
|
1140 |
|
|
|
1141 |
|
|
|
1142 |
The ''-q*'' options will change the default values for |
|
|
1143 |
rolloff and beta as well as use quadratic interpolation of |
|
|
1144 |
filter coefficients, resulting in about 24 bits precision. |
|
|
1145 |
The ''-qs'', ''-q'', or ''-ql'' options specify |
|
|
1146 |
increased accuracy at the cost of lower execution speed. It |
|
|
1147 |
is optional to specify rolloff and beta parameters when |
|
|
1148 |
using the ''-q*'' options. |
|
|
1149 |
|
|
|
1150 |
|
|
|
1151 |
Following is a table of the reasonable defaults which are |
|
|
1152 |
built-in to SoX: |
|
|
1153 |
|
|
|
1154 |
|
|
|
1155 |
__Option  Window  rolloff  beta  interpolation__
(none)  45      0.80     16    linear
-qs     45      0.80     16    quadratic
-q      75      0.875    16    quadratic
-ql     149     0.94     16    quadratic
|
|
1162 |
|
|
|
1163 |
|
|
|
1164 |
''-qs'', ''-q'', or ''-ql'' use window lengths of |
|
|
1165 |
45, 75, or 149 samples, respectively, at the lower |
|
|
1166 |
sample-rate of the two files. This means progressively |
|
|
1167 |
sharper stop-band rejection, at proportionally slower |
|
|
1168 |
execution times. |
|
|
1169 |
|
|
|
1170 |
|
|
|
1171 |
''rolloff'' refers to the cut-off frequency of the low |
|
|
1172 |
pass filter and is given in terms of the Nyquist frequency |
|
|
1173 |
for the lower sample rate. rolloff therefore should be |
|
|
1174 |
something between 0.0 and 1.0, in practice 0.8-0.95. The |
|
|
1175 |
defaults are indicated above. |
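Since ''rolloff'' is a fraction of the Nyquist frequency of the lower sample rate, the cut-off in Hz follows from simple arithmetic. The sketch below records the built-in defaults from the table and converts a rolloff to a frequency; the dictionary layout and the name `cutoff_hz` are illustrative assumptions.

```python
# Defaults as listed in the table: option -> (window, rolloff, beta).
RESAMPLE_DEFAULTS = {
    "": (45, 0.80, 16),
    "-qs": (45, 0.80, 16),
    "-q": (75, 0.875, 16),
    "-ql": (149, 0.94, 16),
}


def cutoff_hz(in_rate, out_rate, rolloff):
    """Low-pass cut-off in Hz for a given rolloff fraction."""
    nyquist = min(in_rate, out_rate) / 2  # Nyquist of the lower rate
    return rolloff * nyquist


if __name__ == "__main__":
    for option, (window, rolloff, beta) in RESAMPLE_DEFAULTS.items():
        label = option or "(none)"
        print(label, window, cutoff_hz(44100, 22050, rolloff))
```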
|
|
1176 |
|
|
|
1177 |
|
|
|
1178 |
The ''Nyquist frequency'' is equal to (sample rate / 2). Logically, this is because the A/D converter needs at least 2 samples to detect 1 cycle at the Nyquist frequency. Frequencies higher than the Nyquist frequency will actually appear as lower frequencies to the A/D converter; this is called aliasing. Normally, A/D converters run the signal through a lowpass filter first to avoid these problems.
|
|
1185 |
|
|
|
1186 |
|
|
|
1187 |
Similar problems will happen in software when reducing the |
|
|
1188 |
sample rate of an audio file (frequencies above the new |
|
|
1189 |
Nyquist frequency can be aliased to lower frequencies). |
|
|
1190 |
Therefore, a good resample effect will remove all frequency |
|
|
1191 |
information above the new Nyquist frequency. |
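Aliasing can be worked out by hand for a pure tone. The sketch below folds a tone's frequency back into the range from 0 to the Nyquist frequency, which is where it would appear after sampling without a filter in front. It is a textbook calculation for an ideal sampler; the function names are hypothetical.

```python
def nyquist(rate):
    """Nyquist frequency for a sample rate, in Hz."""
    return rate / 2


def alias_frequency(freq, rate):
    """Apparent frequency of a pure tone `freq` when sampled at `rate`."""
    folded = freq % rate
    return min(folded, rate - folded)


if __name__ == "__main__":
    rate = 8000
    print("Nyquist:", nyquist(rate))
    for tone in (3000, 5000, 9000):
        print(tone, "Hz appears as", alias_frequency(tone, rate), "Hz")
```

At an 8000 Hz sample rate, a 5000 Hz tone is indistinguishable from a 3000 Hz tone, which is the false data a resampling filter has to remove before reducing the rate.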
|
|
1192 |
|
|
|
1193 |
|
|
|
1194 |
The ''rolloff'' refers to how close to the Nyquist frequency this cutoff is, with closer being better. When increasing the sample rate of an audio file you would not expect any frequencies to exist past the original Nyquist frequency. Because of resampling properties, however, it is common for aliasing data to be created above the old Nyquist frequency. In that case the ''rolloff'' refers to how close to the original Nyquist frequency a lowpass filter is applied to remove this false data, with closer also being better.
|
|
1204 |
|
|
|
1205 |
|
|
|
1206 |
The ''beta'' parameter determines the type of filter |
|
|
1207 |
window used. Any value greater than 2.0 is the beta for a |
|
|
1208 |
Kaiser window. Beta |
|
|
1209 |
'' |
|
|
1210 |
|
|
|
1211 |
|
|
|
1212 |
In the case of Kaiser window (beta |
|
|
1213 |
|
|
|
1214 |
|
|
|
1215 |
This is the default effect if the two files have different |
|
|
1216 |
sampling rates. Default parameters are, as indicated above, |
|
|
1217 |
Kaiser window of length 45, rolloff 0.80, beta 16, linear |
|
|
1218 |
interpolation. |
|
|
1219 |
|
|
|
1220 |
|
|
|
1221 |
__NOTE:__ ''-qs'' is only slightly slower, but more |
|
|
1222 |
accurate for 16-bit or higher precision. |
|
|
1223 |
|
|
|
1224 |
|
|
|
1225 |
__NOTE:__ In many cases of up-sampling, no interpolation |
|
|
1226 |
is needed, as exact filter coefficients can be computed in a |
|
|
1227 |
reasonable amount of space. To be precise, this is done |
|
|
1228 |
when |
|
|
1229 |
|
|
|
1230 |
|
|
|
1231 |
input_rate |
|
|
1232 |
output_rate/gcd(input_rate,output_rate) |
|
|
1233 |
|
|
|
1234 |
|
|
|
1235 |
reverb ''gain-out delay'' [[ ''delay ...'' |
|
|
1236 |
] |
|
|
1237 |
|
|
|
1238 |
|
|
|
1239 |
Add reverberation to a sound sample. Each delay is given in milliseconds, and its feedback depends on the reverb-time in milliseconds. Each delay should be in the range of half to a quarter of the reverb-time to get a realistic reverberation. Gain-out is the volume of the output.
|
|
1245 |
|
|
|
1246 |
|
|
|
1247 |
reverse |
|
|
1248 |
|
|
|
1249 |
|
|
|
1250 |
Reverse the sound sample completely. Included for finding |
|
|
1251 |
Satanic subliminals. |
|
|
1252 |
|
|
|
1253 |
|
|
|
1254 |
__silence__ ''above_periods'' [[ ''duration threshold''[[ ''d'' | ''%'' ] [[ ''below_periods duration threshold''[[ ''d'' | ''%'' ]]
|
|
1262 |
|
|
|
1263 |
|
|
|
1264 |
Removes silence from the beginning or end of a sound file. Silence is anything below a specified threshold.
When trimming silence from the beginning of a sound file, you specify a duration of audio that must be above a given silence threshold before audio data is processed. You can also specify the count of periods of non-silence you want to detect before processing audio data. Specify a period of 0 if you do not want to trim data from the front of the sound file.
When optionally trimming silence from the end of a sound file, you specify the duration of audio that must be below a given threshold before processing of audio data stops. A count of periods that occur below the threshold may also be specified. If these options are not specified, data is not trimmed from the end of the audio file.
Duration counts may be given as a time, in the format hh:mm:ss.frac, or as an exact count of samples.
The threshold may be suffixed with d or % to indicate that the value is in decibels or a percentage of the maximum sample value. A value of '0%' will look for total silence.
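The two threshold suffixes can be related to each other with a small conversion sketch. It assumes that decibel values are relative to full scale and follow the usual 20*log10 amplitude convention, which the text above does not state explicitly, and it only handles the suffixed forms. `threshold_fraction` is a hypothetical helper.

```python
def threshold_fraction(spec):
    """Convert a threshold such as '1%' or '-40d' to a fraction of full scale."""
    if spec.endswith("%"):
        return float(spec[:-1]) / 100.0
    if spec.endswith("d"):
        # Assumption: dB relative to the maximum sample value.
        return 10 ** (float(spec[:-1]) / 20.0)
    raise ValueError("this sketch expects a 'd' or '%' suffix")


if __name__ == "__main__":
    for spec in ("0%", "1%", "-40d", "-6d"):
        print(spec, "->", threshold_fraction(spec))
```

Under these assumptions a threshold of -40d and a threshold of 1% describe the same level.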
|
|
1285 |
|
|
|
1286 |
|
|
|
1287 |
speed [[ -c ] ''factor'' |
|
|
1288 |
|
|
|
1289 |
|
|
|
1290 |
Speed up or down the sound, as a magnetic tape with a speed |
|
|
1291 |
control. It affects both pitch and time. A factor of 1.0 |
|
|
1292 |
means no change, and is the default. 2.0 doubles speed, thus |
|
|
1293 |
time length is cut by a half and pitch is one octave higher. |
|
|
1294 |
0.5 halves speed thus time length doubles and pitch is one |
|
|
1295 |
octave lower. If the optional -c parameter is used then the |
|
|
1296 |
factor is specified in |
|
|
1297 |
|
|
|
1298 |
|
|
|
1299 |
split |
|
|
1300 |
|
|
|
1301 |
|
|
|
1302 |
Turn a mono sample into a stereo sample by copying the input |
|
|
1303 |
channel to the left and right channels. |
|
|
1304 |
|
|
|
1305 |
|
|
|
1306 |
stat [[ ''-s n'' ] [[ ''-rms'' ] [[ ''-v'' ] [[ ''-d'' ]
|
|
1308 |
|
|
|
1309 |
|
|
|
1310 |
Do a statistical check on the input file, and print results |
|
|
1311 |
on the standard error file. Audio data is passed unmodified |
|
|
1312 |
from input to output file unless used along with the |
|
|
1313 |
__-e__ option. |
|
|
1314 |
|
|
|
1315 |
|
|
|
1316 |
The |
|
|
1317 |
-v__ ''number'' which |
|
|
1318 |
will make the sample as loud as possible without |
|
|
1319 |
clipping. |
|
|
1320 |
|
|
|
1321 |
|
|
|
1322 |
The option __-v__ will print out the |
|
|
1323 |
__ |
|
|
1324 |
|
|
|
1325 |
|
|
|
1326 |
The __-s n__ option is used to scale the input data by a |
|
|
1327 |
given factor. The default value of n is the max value of a |
|
|
1328 |
signed long variable (0x7fffffff). Internal effects always |
|
|
1329 |
work with signed long PCM data and so the value should |
|
|
1330 |
relate to this fact. |
|
|
1331 |
|
|
|
1332 |
|
|
|
1333 |
The __-rms__ option will convert all output average |
|
|
1334 |
values to ''root mean square'' format. |
|
|
1335 |
|
|
|
1336 |
|
|
|
1337 |
There is also an optional parameter __-d__ that will print out a hex dump of the sound file from the internal buffer, which holds 32-bit signed PCM data. This is mainly of use in tracking down endian problems that creep into SoX on cross-platform versions.
|
|
1342 |
|
|
|
1343 |
|
|
|
1344 |
stretch ''factor [[window fade shift |
|
|
1345 |
fading]'' |
|
|
1346 |
|
|
|
1347 |
|
|
|
1348 |
Time stretch file by a given factor. Change duration without |
|
|
1349 |
affecting the pitch. ''factor'' of stretching: |
|
|
1350 |
''window'' size is in |
|
|
1351 |
ms. Default is 20ms. The ''fade'' option, can be |
|
|
1352 |
''shift'' ratio, in [[0.0 1.0]. Default |
|
|
1353 |
depends on stretch factor. 1.0 to shorten, 0.8 to lengthen. |
|
|
1354 |
The ''fading'' ratio, in [[0.0 0.5]. The amount of a |
|
|
1355 |
fade's default depends on factor and shift. |
|
|
1356 |
|
|
|
1357 |
|
|
|
1358 |
swap [[ ''1 2'' | ''1 2 3 4'' ]
|
|
1360 |
|
|
|
1361 |
|
|
|
1362 |
Swap channels in multi-channel sound files. Optionally, you |
|
|
1363 |
may specify the channel order you would like the output in. |
|
|
1364 |
This defaults to output channel 2 and then 1 for stereo and |
|
|
1365 |
2, 1, 4, 3 for quad-channels. An interesting feature is that |
|
|
1366 |
you may duplicate a given channel by overwriting another. |
|
|
1367 |
This is done by repeating an output channel on the command |
|
|
1368 |
line. For example, swap 2 2 will overwrite channel 1 with |
|
|
1369 |
channel 2's data; creating a stereo file with both channels |
|
|
1370 |
containing the same audio data. |
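The channel-order argument amounts to picking, for each output channel, which input channel to copy. The sketch below applies such an order to a list of frames, where each frame is a tuple with one sample per channel. It illustrates the reordering only; `swap_channels` is a hypothetical name.

```python
def swap_channels(frames, order):
    """Reorder channels in every frame.

    `order` lists 1-based input channel numbers, one per output channel,
    in the same style as the arguments to the swap effect.
    """
    return [tuple(frame[ch - 1] for ch in order) for frame in frames]


if __name__ == "__main__":
    stereo = [(1, 2), (3, 4)]
    print(swap_channels(stereo, [2, 1]))  # default stereo swap
    print(swap_channels(stereo, [2, 2]))  # channel 2 duplicated
    quad = [(1, 2, 3, 4)]
    print(swap_channels(quad, [2, 1, 4, 3]))  # default quad order
```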
|
|
1371 |
|
|
|
1372 |
|
|
|
1373 |
synth [[ ''length'' ] ''type mix'' [[ ''freq'' [[ |
|
|
1374 |
''-freq2'' ] |
|
|
1375 |
|
|
|
1376 |
|
|
|
1377 |
[[ ''off'' ] [[ ''ph'' ] [[ ''p1'' ] [[ ''p2'' ] [[ |
|
|
1378 |
''p3'' ] |
|
|
1379 |
|
|
|
1380 |
|
|
|
1381 |
The synth effect will generate various types of audio data. |
|
|
1382 |
Although this effect is used to generate audio data, an |
|
|
1383 |
input file must be specified. The length of the input audio |
|
|
1384 |
file determines the length of the output audio file. |
|
|
1385 |
|
|
|
1386 |
|
|
|
1387 |
trim ''start'' [[ ''length'' ] |
|
|
1388 |
|
|
|
1389 |
|
|
|
1390 |
Trim can trim off unwanted audio data from the beginning and end of the audio file. Audio samples are not sent to the output stream until the ''start'' location is reached.
The optional ''length'' parameter gives the number of samples to output after the ''start'' sample and is used to trim off the back side of the audio data. Using a value of 0 for the ''start'' parameter allows trimming off the back side only.
Both options can be specified using either an amount of time or an exact count of samples. The format for specifying lengths in time is hh:mm:ss.frac. A start value of 1:30.5 will not start until 1 minute and 30.5 seconds into the audio data. The format for specifying sample counts is the number of samples with the letter 's' appended to it. A value of 8000s will wait until 8000 samples are read before starting to process audio data.
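The two position formats can be converted to a common unit with a few lines of code. The sketch below turns either form into a sample count for a given sample rate; rounding to the nearest sample is an assumption, and `to_samples` is a hypothetical helper rather than a SoX function.

```python
def to_samples(spec, rate):
    """Convert 'hh:mm:ss.frac' or '<count>s' to a number of samples."""
    if spec.endswith("s"):
        return int(spec[:-1])  # already an exact sample count
    seconds = 0.0
    for part in spec.split(":"):
        # Each colon-separated field is worth 60 of the next one.
        seconds = seconds * 60 + float(part)
    return round(seconds * rate)  # assumption: round to the nearest sample


if __name__ == "__main__":
    print(to_samples("8000s", 44100))
    print(to_samples("1:30.5", 8000))
    print(to_samples("0:00:02", 8000))
```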
|
|
1407 |
|
|
|
1408 |
|
|
|
1409 |
vibro ''speed'' [[ ''depth'' ]
|
|
1411 |
|
|
|
1412 |
|
|
|
1413 |
Add the world-famous Fender Vibro-Champ sound effect to a |
|
|
1414 |
sound sample by using a sine wave as the volume knob. |
|
|
1415 |
__Speed__ gives the Hertz value of the wave. This must be |
|
|
1416 |
under 30. __Depth__ gives the amount the volume is cut |
|
|
1417 |
into by the sine wave, ranging 0.0 to 1.0 and defaulting to |
|
|
1418 |
0.5. |
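Using a sine wave as the volume knob can be sketched as a gain that swings between 1 - depth and 1 at the given speed. The exact formula used by SoX is not given in the text, so the one below is an assumption chosen to match the description; `vibro_gain` is a hypothetical name.

```python
import math


def vibro_gain(n, rate, speed, depth=0.5):
    """Gain applied to sample number n at the given sample rate.

    `speed` is the sine frequency in Hz, `depth` is in [0.0, 1.0].
    """
    lfo = (1 + math.sin(2 * math.pi * speed * n / rate)) / 2  # 0..1
    return (1 - depth) + depth * lfo


if __name__ == "__main__":
    rate, speed = 8000, 5
    for n in (0, 400, 800, 1200):
        print(n, round(vibro_gain(n, rate, speed), 3))
```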
|
|
1419 |
|
|
|
1420 |
|
|
|
1421 |
vol ''gain'' [[ ''type'' [[ ''limitergain'' ] ]
|
|
1423 |
|
|
|
1424 |
|
|
|
1425 |
The vol effect is much like the command line option -v. It |
|
|
1426 |
allows you to adjust the volume of an input file and allows |
|
|
1427 |
you to specify the adjustment in relation to amplitude, |
|
|
1428 |
power, or dB. If ''type'' is not specified then it |
|
|
1429 |
defaults to ''amplitude''. |
|
|
1430 |
When type is ''amplitude'' then a linear change of the |
|
|
1431 |
amplitude is performed based on the gain. Therefore, a value |
|
|
1432 |
of 1.0 will keep the volume the same, 0.0 to |
|
|
1433 |
'' |
|
|
1434 |
When type is ''power'' then a value of 1.0 also means no |
|
|
1435 |
change in volume. |
|
|
1436 |
When type is ''dB'' the amplitude is changed |
|
|
1437 |
logarithmically. 0.0 is constant while +6 doubles the |
|
|
1438 |
amplitude. |
|
|
1439 |
An optional ''limitergain'' value can be specified; it should be much less than 1.0 (e.g. 0.05 or 0.02) and is used only on peaks to prevent clipping. If this parameter is not specified, no limiter is used. In verbose mode, this effect displays the percentage of audio data that needed to be limited.
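The three gain types can be compared by converting each to a plain amplitude multiplier. The sketch below uses the standard 20*log10 relation for dB, under which +6 dB is a factor of about 1.995, and assumes that a power gain corresponds to the square root of the amplitude factor. Both the power mapping and the name `amplitude_factor` are assumptions for illustration.

```python
import math


def amplitude_factor(gain, kind="amplitude"):
    """Amplitude multiplier for a gain of the given type."""
    if kind == "amplitude":
        return gain
    if kind == "power":
        # Assumption: power is proportional to amplitude squared.
        return math.sqrt(gain)
    if kind == "dB":
        return 10 ** (gain / 20)
    raise ValueError("unknown gain type: " + kind)


if __name__ == "__main__":
    print(amplitude_factor(1.0))           # no change
    print(amplitude_factor(4.0, "power"))  # amplitude doubles
    print(amplitude_factor(0.0, "dB"))     # no change
    print(amplitude_factor(6.0, "dB"))     # roughly doubles
```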
|
|
1445 |
!!BUGS |
|
|
1446 |
|
|
|
1447 |
|
|
|
1448 |
The syntax is horrific. That's the breaks when trying to handle all things from the command line.
|
|
1450 |
|
|
|
1451 |
|
|
|
1452 |
Please report any bugs found in this version of SoX to Chris |
|
|
1453 |
Bagwell (cbagwell@sprynet.com) |
|
|
1454 |
!!FILES |
|
|
1455 |
!!SEE ALSO |
|
|
1456 |
|
|
|
1457 |
|
|
|
1458 |
play(1), rec(1), |
|
|
1459 |
__soxexam(1)__ |
|
|
1460 |
!!NOTICES |
|
|
1461 |
|
|
|
1462 |
|
|
|
1463 |
The version of SoX that accompanies this manual page is supported by Chris Bagwell (cbagwell@users.sourceforge.net). Please refer any questions regarding it to this address. You may obtain the latest version at the web site http://sox.sourceforge.net/
|
|
1468 |
!!AUTHOR |
|
|
1469 |
|
|
|
1470 |
|
|
|
1471 |
Chris Bagwell (cbagwell@users.sourceforge.net). |
|
|
1472 |
|
|
|
1473 |
|
|
|
1474 |
Updates by Anonymous |
|
|
1475 |
---- |