Audio Feature Extraction
The Audio Feature Extraction API allows users to upload an audio file and extract detailed audio features such as acousticness, danceability, energy, and more. This API is useful for music analysis, recommendation systems, and audio processing applications.
Request and Response
Extracts a variety of audio features from an uploaded audio file.
Request
- Method:
POST
- URL:
/v1/analysis/audio-features
- Full URL:
https://api.reccobeats.com/v1/analysis/audio-features
Headers
Name | Type | Required | Description |
---|---|---|---|
Content-Type | String | Yes | Must be multipart/form-data for file uploads |
Body Parameters
Name | Type | Required | Description |
---|---|---|---|
audioFile | Binary | Yes | The audio file to upload. Maximum file size: 5MB |
Supported Audio Formats
- MP3
- OGG
- Vorbis
- AIFF/AIFC
- WAV
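A client can check a file against the documented limits before uploading. The following is a minimal sketch, not part of the API: the `check_upload` helper, the extension-to-format mapping, and the reading of "5MB" as 5 MiB are all assumptions for illustration.

```python
from pathlib import Path

# Limits taken from this page. The extension set is an assumed mapping of the
# listed formats to common file suffixes, and "5MB" is assumed to mean 5 MiB.
MAX_FILE_BYTES = 5 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp3", ".ogg", ".aiff", ".aif", ".aifc", ".wav"}


def check_upload(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    file = Path(path)
    problems = []
    if file.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {file.suffix or '(none)'}")
    if file.stat().st_size > MAX_FILE_BYTES:
        problems.append(f"file is larger than {MAX_FILE_BYTES} bytes")
    return problems
```

A suffix check only approximates the server's format detection, so a file that passes here may still be rejected with a 400 response.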
Response
Success Response
- Status Code:
200 OK
- Content-Type:
application/json
Response Body
{
"acousticness": 0.174,
"danceability": 0.4004,
"energy": 0.6899,
"instrumentalness": 0.0309,
"liveness": 0.1188,
"loudness": -6.0411,
"speechiness": 0.0566,
"tempo": 147.7849,
"valence": 0.2747
}
Field | Type | Description |
---|---|---|
acousticness | Float | Confidence (0.0 to 1.0) that the track is acoustic. Higher values indicate more natural sounds. |
danceability | Float | Suitability for dancing (0.0 to 1.0). Higher values indicate more rhythmically engaging tracks. |
energy | Float | Intensity and liveliness (0.0 to 1.0). Higher values indicate more energetic tracks. |
instrumentalness | Float | Likelihood of no vocals (0.0 to 1.0). Values above 0.5 suggest instrumental tracks. |
liveness | Float | Probability of a live audience (0.0 to 1.0). Values above 0.8 strongly suggest a live track. |
loudness | Float | Average loudness in decibels (dB). Typically ranges between -60 and 0 dB. |
speechiness | Float | Presence of spoken words (0.0 to 1.0). Values above 0.66 indicate mostly speech. |
tempo | Float | Estimated tempo in beats per minute (BPM). Typically ranges between 0 and 250. |
valence | Float | Emotional tone (0.0 to 1.0). Higher values indicate a happier mood, lower values a sadder one. |
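The thresholds in the table above can be applied directly to a response body. This sketch uses a hypothetical `describe_features` helper with illustrative label strings; only the three threshold values (0.5, 0.8, 0.66) come from the table.

```python
def describe_features(features: dict) -> list[str]:
    """Apply the thresholds from the field table to one response body."""
    labels = []
    if features["instrumentalness"] > 0.5:
        labels.append("likely instrumental")
    if features["liveness"] > 0.8:
        labels.append("likely live recording")
    if features["speechiness"] > 0.66:
        labels.append("mostly speech")
    return labels


# The sample Response Body shown above.
sample = {
    "acousticness": 0.174,
    "danceability": 0.4004,
    "energy": 0.6899,
    "instrumentalness": 0.0309,
    "liveness": 0.1188,
    "loudness": -6.0411,
    "speechiness": 0.0566,
    "tempo": 147.7849,
    "valence": 0.2747,
}

print(describe_features(sample))  # prints [] because no threshold is crossed
```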
Error Response
- Status Code:
400 Bad Request
- Content-Type:
application/json
Response Body
{
"timestamp": "2025-03-26T17:06:19.870+00:00",
"error": "Unsupported file type",
"path": "uri=/v1/analysis/audio-features",
"status": 4002
}
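Field | Type | Description |
---|---|---|
timestamp | String | Time the error occurred, in ISO 8601 format |
error | String | Short error description |
message | String | Detailed error explanation (not present in the example above) |
path | String | Request path that produced the error |
status | Integer | Error status code |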
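The example body carries `timestamp`, `error`, `path`, and `status`, while the table also lists `message`, so a client is safest treating every field except `error` as optional. The sketch below shows one way to do that; `ApiError` and `parse_error` are hypothetical names, not part of the API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Error body fields; everything except `error` is treated as optional."""
    error: str
    status: Optional[int] = None
    path: Optional[str] = None
    timestamp: Optional[str] = None
    message: Optional[str] = None


def parse_error(body: str) -> ApiError:
    """Decode an error response body without assuming every field is present."""
    data = json.loads(body)
    return ApiError(
        error=data.get("error", "unknown error"),
        status=data.get("status"),
        path=data.get("path"),
        timestamp=data.get("timestamp"),
        message=data.get("message"),
    )


# The error Response Body shown above.
example = """{
  "timestamp": "2025-03-26T17:06:19.870+00:00",
  "error": "Unsupported file type",
  "path": "uri=/v1/analysis/audio-features",
  "status": 4002
}"""

err = parse_error(example)
print(err.error, err.status)  # prints: Unsupported file type 4002
```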
Example
Request Example (cURL)
curl -X POST https://api.reccobeats.com/v1/analysis/audio-features \
-H "Content-Type: multipart/form-data" \
-F "audioFile=@/path/to/audio.mp3"
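The same upload can be sketched in Python with only the standard library. This is an illustrative client, not an official SDK: `build_multipart` and `extract_features` are hypothetical helpers, the URL is the Full URL given in the Request section, and the per-part `application/octet-stream` type is an assumption about what the server accepts.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Full URL from the Request section of this page.
API_URL = "https://api.reccobeats.com/v1/analysis/audio-features"


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode one file field as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        # Assumption: a generic part type is accepted by the server.
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def extract_features(path: str, url: str = API_URL) -> dict:
    """POST the file as the audioFile field and return the decoded JSON body."""
    file = Path(path)
    body, content_type = build_multipart("audioFile", file.name, file.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx responses such as 400 or 415.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `extract_features("/path/to/audio.mp3")` would then return a dictionary with the fields described in the Response section, assuming the request succeeds.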
Response Example
{
"acousticness": 0.3287,
"danceability": 0.4065,
"energy": 0.6529,
"instrumentalness": 0.0269,
"liveness": 0.131,
"loudness": -6.5329,
"speechiness": 0.0493,
"tempo": 140.7761,
"valence": 0.3396
}
Constraints
- Maximum File Size: 5MB
- Supported Formats: MP3, OGG, Vorbis, AIFF/AIFC, WAV
- Maximum audio length: 30 seconds (audio beyond 30 seconds will be truncated)
Info: To analyze audio longer than 30 seconds, divide it into multiple files of at most 30 seconds each, extract features from each file, and then average the extracted values. The recommended audio sampling rate is 44,100 Hz, although the model is designed to support multiple sampling rates.
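The split-and-average approach described above can be sketched as follows. Both helpers are hypothetical: `chunk_ranges` only computes the time ranges to cut (the cutting itself needs an audio tool of your choice), and `average_features` takes an unweighted mean, so a short final chunk counts as much as a full one. Averaging `loudness` in dB and `tempo` this way is a simplification.

```python
from statistics import fmean


def chunk_ranges(duration_s: float, chunk_s: float = 30.0) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for consecutive chunks of at most chunk_s."""
    ranges = []
    start = 0.0
    while start < duration_s:
        ranges.append((start, min(start + chunk_s, duration_s)))
        start += chunk_s
    return ranges


def average_features(chunks: list[dict]) -> dict:
    """Average each feature across the per-chunk response bodies (unweighted)."""
    if not chunks:
        raise ValueError("need at least one chunk result")
    return {
        key: fmean([chunk[key] for chunk in chunks])
        for key in chunks[0]
    }


# A 75-second track would be cut into three files.
print(chunk_ranges(75.0))  # prints [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```

Each per-chunk dictionary passed to `average_features` is expected to have the same keys as the Response Body documented on this page.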
Status Codes
Code | Description |
---|---|
200 | Successful response |
400 | Bad request (e.g., Unsupported file type) |
415 | Unsupported media type (e.g., Content-Type 'null' is not supported.) |