Any Time v.1.1
Time - Pitch - Sample rate converter
Try the free shareware.
Buy the full version for $50 / €45.
Runs under: Windows 10 / 8 / 7 / Vista / XP.
Step 1 — Input files
Step 2 — Processing options
Step 3 — Output options
Step 4 — Calculate output
Any Time lets you independently take control of the duration, pitch, and sample rate of an audio recording.
- Do time stretching — Change the playback time of the recording without changing the pitch.
- Do pitch scaling — Change the pitch of the recording without changing the playback time, with optional formant correction to remove the "Mickey Mouse" effect.
- Do sample rate conversion — Change the sample rate without changing either the pitch or the playback time.
- Do any combination imaginable of the operations above!
- Recreate missing high-frequency components — Optionally synthesize "overtones" into any empty upper frequency range (e.g. when 'upsampling' or when scaling down the pitch) using a proprietary high frequency extrapolation algorithm.
- Professional mastering functions, such as automatic heuristics-based volume normalization with psycho-acoustic correction, and high-order noise-shaped (inverse F-weighted) dithering of the output.
- Extremely high precision algorithms used — Audio quality has always been prioritized over computation cost.
- Frequency analysis tool with an advanced note calculator helps you easily calculate pitch adjustments.
- Supports many common file formats.
- Handles audiophile quality audio formats — E.g. 24-bit 192 kHz multi-channel audio at 64-bit floating point precision.
- Process hundreds of files as easily as you do a single file thanks to the batch processing support.
Also see the Audio quality page for a few more in-depth descriptions…
What is it not?
- Most importantly, it is not fast. For longer recordings, you probably want to leave it running overnight. The idea here is to get the very best possible audio quality regardless of cost in CPU time.
Awards & Reviews…
Examples of how you can use it:
- Use the high quality sample rate conversion to convert between the 44.1 kHz sample rate of audio CDs and the 48 kHz sample rate common with DVD discs and DAT tapes.
- Use high frequency extrapolation to upsample your existing 44.1 and 48 kHz recordings to 96 kHz for playback on e.g. DVD or DVD-Audio players — No need for special upsampling hardware (no current hardware could throw this much CPU power at the problem anyway).
- Use high frequency extrapolation to improve the perceived quality of low bandwidth audio recordings.
- Use time stretching to solve the "24 fps movie played on 25 fps PAL TV" audio pitch problem (if the audio is simply played at 25/24 of the original speed, then it gets perceptibly out of tune).
- Use time stretching to fit already produced recordings to video clips of slightly different length.
- Use pitch scaling to correct pitch inaccuracies in recordings.
- Use pitch scaling to produce many different special effects (e.g. to change the musical key).
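As a rough illustration of the arithmetic behind two of the examples above, the sketch below computes the CD-to-DVD resampling ratio and the pitch error caused by the 25/24 PAL speed-up. The function names are hypothetical and are not part of Any Time; only the sample rates and frame rates come from the text.

```python
import math
from fractions import Fraction

def resampling_ratio(source_rate_hz, target_rate_hz):
    """Exact output/input sample count ratio for a rate conversion."""
    return Fraction(target_rate_hz, source_rate_hz)

def speed_change_in_semitones(speed_factor):
    """Pitch shift (in equal-tempered semitones) caused by a plain speed change."""
    return 12.0 * math.log2(speed_factor)

# 44.1 kHz audio CD to 48 kHz DVD/DAT
cd_to_dvd = resampling_ratio(44100, 48000)

# 24 fps film played on 25 fps PAL TV
pal_speedup = Fraction(25, 24)
pal_pitch_shift = speed_change_in_semitones(float(pal_speedup))

# Time-stretch factor that shortens the audio to PAL length without the pitch shift
pal_stretch = 1 / pal_speedup

print(cd_to_dvd)                  # 160/147
print(round(pal_pitch_shift, 2))  # 0.71
print(pal_stretch)                # 24/25
```

So a plain speed-up raises the pitch by roughly 0.7 of a semitone, whereas time stretching to 24/25 of the original length reaches the same duration with the pitch left alone.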
Any Time - Step 1: Select input files
Any Time - Step 2: Select processing options
Any Time - Step 3: Select output options
Any Time - Step 4: Calculate output data
There are many algorithms for performing sample rate conversion of an audio recording (a.k.a. resampling). These vary widely in complexity and in audio quality. The simplest (e.g. the 'polynomial interpolation algorithms' commonly used in synthesizers) are CPU-efficient but do not perform any filtering of the sound before downsampling (converting from a higher to a lower sample rate). Any frequencies left from the original recording above the frequency limit of the downsampled recording (half the new sample rate) will result in very undesirable 'aliasing noise'. It is therefore very important to filter out those high frequencies before downsampling. The ideal filter would be a 'brick-wall filter' that cuts off everything above the new frequency limit and leaves everything below it intact. Unfortunately such a filter is not practical to design — every real-world filter will have a 'slope' at the cut-off frequency. While you don't want to leave any information above the new frequency limit, you don't want to remove any more than necessary of the frequencies below the limit either. This is one of the two challenges of sample rate conversion. The other is minimizing distortion. For both problems, you can in general say that the more CPU power you throw at the problem, the better the audio quality you can get. Let's have a look at a real-world example:
Here white noise (noise with a flat spectrum) generated at 48000 Hz has first been downsampled to 32000 Hz (which means a frequency limit of 16 kHz) and then upsampled to 48000 Hz again using various algorithms. All calculations have been done using 32-bit floating point in order to avoid quantization issues.
- The blue line is the original spectrum with no resampling.
- The yellow line shows a typical implementation with a nicely designed trade-off between computation time and audio quality for resampling 16-bit audio data. This algorithm can be run in real-time. The yellow 'hills' above 16000 Hz are aliasing noise, but note that they are all below -96 dB and so would be entirely cut away if the sound were to be converted to 16-bit PCM format — which is what the implementation was intended for. Also note the attenuation of high frequencies above 14000 Hz. This is actually a fairly well designed filter, and many audio applications do much, much worse than this - but no names mentioned!
- The red line shows the high-quality resampling algorithm implementation of a very well known audio editor. Not much to comment on really, a nice steep cut-off and no aliasing noise.
- The green line, finally, shows the results with Any Time. The filter cut-off is extremely steep (thus preserving as much as possible of the original sound) and there is no aliasing noise visible. Of course it also takes much longer to compute than the other algorithms. It should be noted that the cut-off is in reality much steeper than what is shown here - more than 80% of the slope for the green line is simply due to a side-effect of the analysis algorithm used to generate the graph (spectral spreading due to finite Fourier-analysis window length). This is also the explanation as to why both the green and red lines seem to extend above 16000 Hz — in reality they don't.
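To make the aliasing and filter-slope discussion above a little more concrete, here is a minimal sketch in plain Python. It is not the algorithm used by Any Time or by any of the products in the graph: it assumes a textbook Hann-windowed sinc low-pass filter, and the function names, the 15000 Hz cut-off and the 255-tap length are illustrative choices only.

```python
import math

def alias_frequency(f_hz, sample_rate_hz):
    """Where an unfiltered tone at f_hz ends up when sampled at sample_rate_hz."""
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

def windowed_sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps):
    """Hann-windowed sinc low-pass FIR taps (num_taps odd), unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz          # cut-off in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2.0 * fc                  # centre tap of the ideal sinc
        else:
            ideal = math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def gain_db(taps, f_hz, sample_rate_hz):
    """Magnitude response of the FIR filter at f_hz, in dB."""
    w = 2.0 * math.pi * f_hz / sample_rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-12))

# A 20 kHz tone left in place when going to 32000 Hz folds down into the audible band.
print(alias_frequency(20000, 32000))  # 12000

# Filter applied at 48000 Hz before downsampling to 32000 Hz (limit 16000 Hz).
taps = windowed_sinc_lowpass(15000.0, 48000.0, 255)
print(gain_db(taps, 10000.0, 48000.0))   # close to 0 dB: passband left intact
print(gain_db(taps, 20000.0, 48000.0))   # strongly negative: alias source removed
```

Increasing the number of taps narrows the 'slope' around the cut-off at the price of more computation, which is the trade-off described above.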
High frequency range extrapolation:
In many applications a low sample rate is used out of necessity, to reduce file size or transmission bandwidth requirements (e.g. normal phone switches use 8000 Hz). The idea of trying to 'recreate' lost higher frequency components is not new, but it is not an easy problem. Of course, information theory says that this is impossible in the 'generic' case where nothing is assumed about the properties of the signal. But it is certainly possible to use different heuristic and statistical methods to try to come up with an algorithm for high frequency extrapolation (a.k.a. synthetic bandwidth extension) that subjectively improves the sound even though it may not provide a perfect 'recreation' of the original.
A well-known example is the 'Spectral Band Replication' algorithm (SBR) used in 'mp3pro' and 'AAC+'. There does not seem to be much released information about how the SBR algorithm actually works, but from what public information we have been able to obtain we guess that it works by matching the low-frequency spectrum against a database of spectra collected from many types of material. The decoder then selects and adds the high-frequency components from one entry in the database. The file would then contain 'hints' that do not take up much space, but which help the decoder make a better selection from the database. One potential problem with this approach is that it may not ensure a harmonic relationship between the added high frequency components and the 'real' lower frequency spectrum (musical tones tend to have harmonic overtones — that's what makes them sound good to our ears).
The proprietary algorithm that we have developed for Any Time uses an entirely different approach. Very broadly speaking, it analyzes the existing frequency spectrum and tries to identify the 'fundamental frequencies' of the sound sources (much like our instrument tuner software); it then adds harmonic series of overtones. The really tricky issues are how to identify these frequencies, how to determine the proper amplitudes for the overtones, how to handle non-harmonic content, and how to do this in a manner that is stable over time.
The graph shows a 44100 Hz synth pad that has first been downsampled to 8000 Hz (i.e. only frequencies below 4000 Hz have been retained from the original) and then 'restored' to 44100 Hz using Any Time's high frequency range extrapolation option.
- The blue line is the original spectrum — from this only the frequencies below 4000 Hz were passed on to Any Time.
- The green line is the reconstruction by Any Time. The results are quite satisfactory, especially considering that the algorithm only had 18% of the original frequency content to work with. Deviations of course increase with the extension 'range' — it is apparent from the graph that the reconstruction in this case is closest to the original in the first octave above the 4000 Hz frequency limit (i.e. the 4000-8000 Hz range), while in the second octave (8000-16000 Hz) the peaks are still fairly well aligned but the amplitudes no longer match as well. Interestingly, the amplitudes are good in the final third octave even though the original spectrum shape there has very little similarity to the spectrum below 4000 Hz! Anyhow, the important question is: how does it sound? Rather than put up sample clips here, we are confident enough to invite you to download and try the software yourself for up to 30 days!
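The proprietary extrapolation algorithm itself is not published, so the following is only a sketch of the easiest part of the idea: once a fundamental frequency is known, listing which harmonic overtones fall into the empty range between the old and the new frequency limit. The function name and the 440 Hz fundamental are assumptions made for illustration. The hard parts named in the text (finding the fundamentals, choosing amplitudes, handling non-harmonic content) are not attempted here.

```python
def missing_harmonics(fundamental_hz, old_limit_hz, new_limit_hz):
    """Harmonics (number, frequency) of a fundamental that lie in the empty band."""
    harmonics = []
    k = 1
    while k * fundamental_hz < new_limit_hz:
        if k * fundamental_hz > old_limit_hz:
            harmonics.append((k, k * fundamental_hz))
        k += 1
    return harmonics

# The example from the text: only content below 4000 Hz survives, and the
# target is 44100 Hz audio, i.e. a frequency limit of 22050 Hz.
retained_fraction = 4000 / 22050
overtones = missing_harmonics(440.0, 4000.0, 22050.0)

print(round(retained_fraction * 100))  # 18
print(overtones[0])                    # (10, 4400.0)
print(overtones[-1])                   # (50, 22000.0)
print(len(overtones))                  # 41
```

For a single 440 Hz tone, 41 overtones would therefore have to be synthesized from only the nine harmonics that survive below 4000 Hz, which gives a feel for why amplitude estimation becomes less reliable further up.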
Pitch scaling and formant correction:
Pitch scaling moves the 'pitch' of a recording up or down without changing the sample rate or the recording length. This is done by breaking down the sound into separate frequency components, scaling them by the desired factor, then 're-synthesizing' the sound. Simply changing the playback speed has the same effect on the pitch - but that also changes the playback time (by the inverse of the pitch factor). One side-effect both have in common is the so-called "Mickey Mouse" effect — i.e. speech (and music too) sounds 'tinny' when pitched up, or 'boomy' when pitched down. This happens because most instruments — the human voice included — work by in one way or another exciting a 'resonance body', and this resonance body has a set of 'resonance frequencies' near which sound is better amplified. These are seen as 'hills', known as 'formants', if you look at a frequency-amplitude graph. When playing notes of different pitches on an instrument (or whatever is producing the sound), the resonance frequencies will remain fixed. But when pitch-scaling, you will also change the resonance frequencies by the same factor as you scale all the frequency components — this is why the 'character' of the instrument changes (e.g. from a normal voice character to a Mickey Mouse voice character). The solution is to apply what is commonly called 'formant correction'. This entails somehow analyzing the 'frequency envelope' of the sound (the overall 'shape' of the frequency graph — a good example, by the way, of something that is much, much easier for a human to do than for a computer) and then re-imposing it on the pitch-scaled sound. The result is very often a sound with a much more 'natural' character!
The graph shows a short segment of a violin recording that has been pitch scaled by a factor of 1.2.
- The blue line is the original (unscaled) spectrum.
- The yellow line is the spectrum after scaling in Any Time without formant correction.
- The green line is the spectrum after scaling in Any Time with formant correction. It much more closely follows the 'envelope' of the blue line.
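The effect shown in the graph can be imitated with a toy model. The sketch below assumes an idealized sound made of harmonic partials whose amplitudes follow a single, known formant 'hill' at 1000 Hz. The envelope shape, the function names and the 100 Hz fundamental are all hypothetical. A real implementation has to estimate the envelope from the audio, which is the difficult part and is not shown.

```python
import math

def envelope(f_hz):
    """Hypothetical resonance body: one formant 'hill' centred on 1000 Hz."""
    return 0.1 + math.exp(-((f_hz - 1000.0) / 300.0) ** 2)

def pitch_scale(partials, factor):
    """Move every partial up or down by the factor; amplitudes travel along."""
    return [(f * factor, a) for f, a in partials]

def formant_correct(scaled, factor, env):
    """Undo the shifted envelope and re-impose the original one."""
    return [(f, a * env(f) / env(f / factor)) for f, a in scaled]

def peak_frequency(partials):
    """Frequency of the loudest partial, i.e. where the formant appears to be."""
    return max(partials, key=lambda p: p[1])[0]

# Harmonics of a 100 Hz tone, shaped by the resonance body.
original = [(100.0 * k, envelope(100.0 * k)) for k in range(1, 31)]
scaled = pitch_scale(original, 1.2)
corrected = formant_correct(scaled, 1.2, envelope)

print(peak_frequency(original))   # 1000.0
print(peak_frequency(scaled))     # about 1200: the formant moved with the pitch
print(peak_frequency(corrected))  # about 960: the partial nearest the original hill
```

Without correction the 'hill' travels from 1000 Hz to about 1200 Hz together with the pitch. With correction the partials are at their new frequencies but the loudest one is again the one closest to 1000 Hz.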
Time stretching changes the length of a recording without changing the sample rate or the pitch of the sound. This is equivalent to doing both pitch scaling and resampling at the same time — so the same comments apply as for those two operations. A mathematical necessity when time-stretching, or when pitch scaling, is that different frequencies get different amounts of 'phase shift' when they are scaled. When time-stretching by non-integer factors this may sometimes be especially noticeable because you can get a pronounced 'wah-wah' effect (amplitude modulation due to cancellation). Any Time employs two methods to try to deal with this. One is a feature to 're-synchronize' the phase at larger amplitude increases. The other is an optional feature to 'enforce the original volume envelope'. This will analyze the overall 'volume envelope' of the original recording and 'force' this back onto the processed audio — much like the formant correction but in the time domain instead of in the frequency domain. The side effect of that is that it will not preserve the spectral characteristics of the sound, since 'soft frequency components' are lifted up to compensate whenever 'loud frequency components' cancel each other out. Whether you prefer this or not is a personal choice — but first try without this option, and if you are bothered by a 'wah-wah', then try turning it on.
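Two of the statements above can be checked with a few lines of arithmetic. The sketch below is only an illustration with hypothetical function names. It shows how a stretch factor splits into a pitch scaling step plus a resampling step whose pitch effects cancel, and how the summed level of two equally loud components depends on their relative phase, which is the cancellation behind the 'wah-wah'.

```python
import cmath
import math

def stretch_plan(stretch_factor):
    """Split a time stretch into pitch scaling followed by resampling.

    Pitch scaling by s keeps the length and multiplies all frequencies by s.
    Resampling to s times as many samples (played at the unchanged sample
    rate) multiplies the length by s and divides all frequencies by s.
    """
    pitch_factor = stretch_factor
    resample_factor = stretch_factor
    net_pitch = pitch_factor / resample_factor
    return pitch_factor, resample_factor, net_pitch

def combined_amplitude(phase_offset_rad):
    """Level of two unit-amplitude components with the given relative phase."""
    return abs(1.0 + cmath.exp(1j * phase_offset_rad))

print(stretch_plan(1.5))                # (1.5, 1.5, 1.0)
print(combined_amplitude(0.0))          # 2.0 (fully reinforcing)
print(combined_amplitude(math.pi / 2))  # about 1.41
print(combined_amplitude(math.pi))      # practically 0 (full cancellation)
```

If the relative phase of such components drifts over time, the level swings between these extremes, which is heard as amplitude modulation.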
In the high quality audio studios of today, recording, mixing and effects processing are often done at a high sample rate and bit depth (e.g. 96000 Hz, 32-bit floating point). When 'mastering' a recording, i.e. preparing it for the final lower resolution distribution medium (e.g. 44100 Hz, 16-bit audio CD format), it is important to get everything right in order to maintain maximum audio quality.
- Volume normalization: When preparing a CD or other 'package', you are often given sound clips with different 'loudness'. The relative volume of each clip then needs to be carefully adjusted so that the listener won't have to reach for the volume control every so often (e.g. at each track change on a CD). Any Time provides a heuristic algorithm for automatically performing such 'volume normalization' that tries to achieve a result very close to what a sound engineer would do. The algorithm also employs psycho-acoustic corrections for the ear's different sensitivity to different frequencies. The same algorithm as is used in Any Time was also selected by a large national radio station after evaluating several different methods.
- Noise floor clean-up: If you have a noisy recording, then Any Time can selectively filter out any frequency components with amplitudes lower than a selectable threshold value. This preserves the full resolution of all other frequency components.
- Noise shaped dithering: Bit-depth quantization is the process of constructing output sample values from incoming, often higher precision data (e.g. 24-bit to 16-bit PCM). The internal processing in Any Time is always done at a precision of 64-bit floating point, and the processing of the audio data means that the output may actually contain information at lower levels than the input had - in other words, doing the bit-depth quantization right is truly important! The most common, and very simple, method is to round off the samples to the nearest value in the output data format. Any Time can of course do this if you want. Unfortunately the high-frequency quantization noise introduced by this is fairly objectionable to the human brain. Any Time can also do state-of-the-art 'noise shaped dithering'. This method adds a very low level noise before rounding the sample values — i.e. instead of always rounding to the nearest value, a random element is introduced. Studies have shown that this noise is much more pleasing for the brain. That is the 'dithering' part. The 'noise shaped' part means that the noise is very carefully filtered so that its spectral characteristics are the inverse of the ear's sensitivity curve — i.e. we get a noise that has most of its energy in the regions where the ear is the least sensitive. This is achieved by carrying the quantization errors from one sample over to the following samples. In other words a fairly bad type of noise is traded for a much better type of noise! The dithering taken together with the noise shaping has a beneficial side effect — a phenomenon called 'stochastic resonance'. This phenomenon 'subjectively' preserves some of the audio information below the lowest level that can 'truly' be represented in the output data format, resulting in improved 'perceived' sound quality. If you have ever seen regular audio CDs bearing '20-bit' stickers, then it is often this phenomenon that they are referring to (note: if you see an HDCD sticker, then that is something entirely different).
To explain it in rather simplified terms: the noise being truly random (i.e. 'stochastic'), it will every now and then amplify ('resonate' with) a low level signal — just for a short while. That short while may be enough if it occurs repeatedly. If the ear can hear short bits of a sound sort of 'sticking out' on top of the noise, just a little bit here and a little bit there, then the brain is smart enough to 'fill in the blanks', so to speak, and perceives it as a steady tone at a lower level than the noise (the perceived amplitude level of the tone is proportional to how often it resonates, which statistically is proportional to the real tone level). Even though the subjective sound quality is improved by this method, and 'perception' of sounds below the output bit-depth is thus possible, to actually claim '20-bit resolution' is rather misleading…
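The claim that dithering preserves information below the smallest output step can be demonstrated with a minimal sketch. It is not Any Time's quantizer: it assumes triangular (TPDF) dither of plus or minus one step and a simple first-order error feedback loop instead of a high-order inverse F-weighted filter, and all names are hypothetical. Sample values are expressed in units of one output step (LSB).

```python
import math
import random

def quantize_plain(samples):
    """Round every sample to the nearest output step."""
    return [math.floor(x + 0.5) for x in samples]

def quantize_dithered_shaped(samples, rng):
    """TPDF dither plus first-order error feedback (a basic noise shaper)."""
    out = []
    error = 0.0
    for x in samples:
        target = x - error                    # carry over the previous error
        dither = rng.random() - rng.random()  # triangular noise in (-1, 1) LSB
        y = math.floor(target + dither + 0.5)
        error = y - target                    # total error made on this sample
        out.append(y)
    return out

rng = random.Random(1)
quiet = [0.3] * 20000   # a level well below one output step

plain_mean = sum(quantize_plain(quiet)) / len(quiet)
shaped_mean = sum(quantize_dithered_shaped(quiet, rng)) / len(quiet)

print(plain_mean)             # 0.0  (the signal is rounded away completely)
print(round(shaped_mean, 2))  # 0.3  (the level survives, on average)
```

Plain rounding erases the 0.3 LSB signal entirely, while the dithered output consists of noisy whole steps whose average still carries the original level. That surviving average is the kind of sub-step information the explanation above refers to.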
Click on one of the links below to start downloading:
- Windows 8 / 7 / Vista / XP.
- A CPU with "SSE2" support, i.e. an Intel Pentium 4 or an AMD Athlon 64 or later.
Limitations of the Trial version
- 30-day trial time limit.
- Only the first file in a batch is converted.
Any Time is commercial software marketed as Shareware.
This means that you get to "try it before you buy it".
If you find that you like it and wish to continue using it past the 30 day free trial period, then you need to buy a license.
There are also some more incentives for buying it:
- Removes the "nag screen".
- Removes the "processing length" limit.
- Removes the "max 1 file per batch" limit.
Buy it on-line here:
Payments are handled by PayPal.
Most credit cards are accepted.
EU-customers: VAT will be added to the price.
License and delivery terms:
After we have received your order, you will be sent an email containing a registration code.
This is your license key that unlocks the trial version into the full version.
Please note: The code is normally sent within 48 hours, but not immediately (also, do check your "spam" or "junk" folders if you don't find it in your in-box).
What you buy is a "single user" license.
It is valid for the current program version, and for any minor version number updates (major version updates may require an upgrade fee).
You are allowed to install it on more than one computer, but you are not allowed to lend it to other persons. The license is personal and cannot be transferred or resold.
Thank you for your order!
If everything went fine with the PayPal transaction, an email containing your reg-code and further instructions should arrive within the next 48 hours.
Please be patient, orders are manually verified before delivery. If you don't see an email, be sure to check your junk-mail folder before contacting support.