Data protection design of the automatic transcription of audio files
I. Initial situation / brief description of the company
We are dr. dresing & pehl GmbH, Deutschhausstraße 22A, 35037 Marburg, Germany. Since 2005 we have been distributing software for the manual transcription of interviews under the brand name “audiotranskription”, along with optional foot controls. The software is mainly used in a university context, where it is a central component of qualitative methods training.
In the following, we describe our contractual and technical measures to ensure that automatic speech recognition complies with data protection regulations.
II. Important terms
For a better understanding, we distinguish between two key terms:
- Contract data: Your personal information such as name and address. Everything we need for ordering, payment and delivery.
- Order data: Your audio and text files and the transcripts created from them. This data usually also contains personal data of third parties (the interview partners).
III. You can find out about your rights here:
We take data protection seriously and comply with the General Data Protection Regulation (GDPR).
In accordance with Art. 13 GDPR, you will be informed in detail about data processing before or when you create a customer account. A prominent link to the privacy policy is provided there. This can also be found directly here.
IV. Overview of automatic transcription
Interviews or other voice files can be uploaded to our server via the f4 software or a web client. There, the voice files are automatically converted into text.
After conversion, you can download and edit the generated text directly. From then on, the data exists only on your own computer. To protect your data, we delete all information from our server as soon as the transcription has been completed and sent to you.
How exactly does it all work? We will explain this step by step in the following sections.
1. Software installation/registration
To use our service, you need either f4 on your computer or our web client. Personal registration is required to get started:
The Register button will guide you through the registration process. In both cases, only a few steps are necessary:
Step 1: Create user account
All you need to create your account is your e-mail address and a secure password (a combination of upper and lower case letters, special characters and numbers, at least 10 characters long).
If you forget your password, you can reset it at any time. We will send you a security code to your registered e-mail address.
In the registration form we also ask you to agree to our privacy policy. You can find it via a link provided directly in the form or here.
Step 2: Confirmation of registration
After registration you will receive an e-mail with a confirmation code. Enter this code in the software to activate your account.
Please note: For security reasons, we will have to delete your data if you do not enter the code within 24 hours.
Step 3: Conclusion of a data processing agreement (DPA)
In the next step, we present you with the data processing agreement (DPA; in German: Auftragsverarbeitungsvertrag, AVV). This contract is essential for anyone who processes data from third parties, e.g. anyone who wants to evaluate interviews, and it is required under the GDPR.
The contract documents in detail how we ensure data protection in technical and organizational terms.
To ensure that everything is GDPR-compliant, please specify the exact purpose of the data processing and the type of personal data to be processed. After your approval, we will send you the contract by e-mail. This means that the contract is concluded in accordance with Art. 28 para. 9 GDPR, and if your project is ever audited by data protection officers, you can prove that you meet the legal requirements.
Once you have completed these steps, your registration is complete and you can start using our automatic transcription service.
Step 3a (optional): Obligation to maintain confidentiality in accordance with § 203 StGB
For professional groups with special confidentiality obligations (e.g. in the legal or medical field), we offer an additional confidentiality agreement on request. This ensures compliance with legal requirements.
2. Activation for the upload of order data
After completing the registration, the account is activated for uploading audio files to our server. We store your registration data securely on our speech recognition server. This information is kept strictly separate from your billing data in the store – both physically and in the data structure. You can find more details on the technical implementation in the section “Infrastructure for data processing” under point VII.
3. Purchase of time quotas via the online store
To use our automatic transcription, you work with time quotas. You can easily purchase these as credit codes in our online store. Here are a few important points: You will receive the codes conveniently by e-mail. The codes are not tied to you personally. This means you can use them flexibly or even pass them on. The only important thing is that the user is registered with us.
When you shop with us, we store certain information: name, address, e-mail address, telephone number, order date and items purchased. We need this data for our accounting and billing. We store it on the server of our web store and on our own server in Marburg. In doing so, we comply with the statutory retention periods.
We treat your payment data, especially credit card information, with the utmost care: we do not store any credit card data ourselves. We process payment via secure systems (so-called iframes) or direct payment pages from Stripe and PayPal.
V. Processing of individual orders
This is how an individual order (i.e. the transcription of an interview) works.
When you use our service, your order goes through several steps:
- You upload your audio file to our server.
- We process the file on our server.
- You download the finished result.
- We delete the order data from our server.
Important for you to know: We only store your order data on our server for as long as is absolutely necessary for processing. When you download the result, the data is transferred to your computer. You can then save it locally and continue working with it.
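As an illustration only, the four steps above can be read as a small state machine in which every path ends with deletion. The following Python sketch uses hypothetical names (`JobState`, `TRANSITIONS`, `advance`) and does not describe the actual server code.

```python
from enum import Enum


class JobState(Enum):
    UPLOADED = "uploaded"      # audio file received by the server
    PROCESSED = "processed"    # transcript created from the audio file
    DOWNLOADED = "downloaded"  # client has retrieved the result
    DELETED = "deleted"        # order data removed from the server


# Allowed transitions (assumed for illustration); every path ends in DELETED.
TRANSITIONS = {
    # DELETED directly after upload covers failed or interrupted uploads.
    JobState.UPLOADED: {JobState.PROCESSED, JobState.DELETED},
    # DELETED directly after processing covers an expired collection period.
    JobState.PROCESSED: {JobState.DOWNLOADED, JobState.DELETED},
    JobState.DOWNLOADED: {JobState.DELETED},
    JobState.DELETED: set(),
}


def advance(current: JobState, target: JobState) -> JobState:
    """Return the new state, or raise ValueError if the step is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Modelling the order this way makes it easy to check that no state keeps order data on the server indefinitely.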
1. Uploading audio files
Audio files can be uploaded to our server once you are registered and logged in to a client. During upload, the client generates an asymmetric key pair for each audio file. The public key (job key) is sent to the server together with the audio file. When using f4transkript, the private key is encrypted with your secret password and stored on the client computer. This ensures that the order data can only be decrypted from the registered client. When using f4x via the browser, the private key is stored in encrypted form on a separate server (separate from the speech recognition).
The upload to our server takes place via a secure connection. File names are pseudonymized with random but unique names before processing. When using f4, this already happens during the upload.
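One possible way to pseudonymize a file name with a random but unique name is sketched below in Python. The function name `pseudonymize_filename`, the use of UUIDs, and the decision to keep the file extension are assumptions made for this illustration and are not taken from the actual client.

```python
import uuid
from pathlib import PurePath


def pseudonymize_filename(original_name: str) -> str:
    """Replace the file name with a random UUID (assumption: extension is kept)."""
    suffix = PurePath(original_name).suffix.lower()  # e.g. ".mp3"
    # uuid4() is random; a collision between two names is practically impossible.
    return f"{uuid.uuid4().hex}{suffix}"
```

A name such as "Interview Ms Smith.MP3" would thus be replaced by 32 random hexadecimal characters followed by ".mp3", so the name seen during processing no longer reveals who was interviewed.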
2. Processing
For processing, the audio file is decoded by the speech recognition algorithm and converted into a text file. The audio file is deleted immediately after successful conversion to a text file. The finished text file is encrypted with the job’s public key and stored temporarily on the server for retrieval.
The server reports a status to the client for each job. When a job has been completed successfully, the server reports this status to the client, which activates the “Download” button there.
3. Download
The finished text files can be downloaded from the client. After a successful download, the text file is decrypted with the private key on the client. When using f4, the combination of public and private key ensures that the results can only be decrypted on the computer from which the job was uploaded. When used via the browser, the result can only be decrypted with the correct login data.
4. Deletion
As soon as the server receives the message about the successful download, the file is permanently deleted from the server.
If an error occurs during the upload, e.g. because a file format is not recognized or the connection is interrupted, the incomplete audio file is immediately deleted from the server. The client then receives a corresponding message.
If a result is not collected after 14 days, we will send a notification by e-mail. If this notice remains unanswered, there will be another reminder after 7 days. If the 7-day collection period specified therein expires, the order data will be deleted and we will inform you of this by e-mail.
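The reminder and deletion dates described above can be worked out with simple date arithmetic. The Python sketch below assumes that the 14 days are counted from the moment the result is ready for download. The function name `retention_schedule` and the constant names are hypothetical.

```python
from datetime import datetime, timedelta

FIRST_NOTICE_AFTER = timedelta(days=14)  # first e-mail notification
REMINDER_AFTER = timedelta(days=7)       # second reminder, counted from the first notice
COLLECTION_PERIOD = timedelta(days=7)    # period named in the reminder


def retention_schedule(ready_at: datetime) -> dict[str, datetime]:
    """Return the dates of the first notice, the reminder and the deletion."""
    first_notice = ready_at + FIRST_NOTICE_AFTER
    reminder = first_notice + REMINDER_AFTER
    deletion = reminder + COLLECTION_PERIOD
    return {"first_notice": first_notice, "reminder": reminder, "deletion": deletion}
```

Under these assumptions, an uncollected result that became available on 01.09.2024 would trigger the first notice on 15.09.2024, the reminder on 22.09.2024 and the deletion on 29.09.2024.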
VI. Duration of data storage and data deletion
With regard to the duration of data storage and data erasure, a distinction must be made as follows:
- Contract data is initially stored permanently on the speech recognition server for legitimization and order control. It is deleted when the account is deleted, provided that no contractual and/or statutory retention periods prevent deletion.
- Order data, i.e. the audio files and the corresponding text files, is stored for the duration of processing until you download it or until the agreed deletion period has expired, and is then automatically deleted.
- Additional information on the order data, such as file size and date of upload, is stored to enable and document the processing and invoicing of individual orders. It is stored for as long as the account is active, so that you can trace your orders and possible claims can be documented. It is deleted when the account is deleted.
- Purchase data from buying time quotas (e.g. name, address, e-mail address, telephone number (optional), date of order and number of items ordered) is stored on the webshop server and on our in-house server in Marburg for billing and accounting purposes, in accordance with statutory retention periods.
Detailed information on the exact data, processing purposes and storage periods is provided in the privacy policy.
VII. Infrastructure for data processing
The infrastructure used for data processing is divided into four physically independent areas. The technical and organizational measures (TOMs) in the appendix to the GCU also describe the infrastructure used. In detail:
1. Speech recognition server
The “speech recognition server” contains the speech recognition algorithm and manages order processing and user administration. The order data is temporarily stored here during processing. This data is processed on a dedicated root server of Hetzner Online GmbH in Nuremberg or Falkenstein.
The data center is DIN-ISO/IEC-27001-certified (German accreditation body D-ZM-18855-01-00, certificate number ZN-2016-04). A contract for order processing was concluded on 29.10.2018.
2. Webshop
The webshop for the purchase of time quotas and e-mail services run via a server of ALL-INKL.COM Neue Medien Münnich with server locations in Dresden and Friedersdorf. The address data provided by you, the items purchased and correspondence by e-mail are stored here. A contract for order processing was concluded on 25.05.2018.
3. Internal order processing
For billing and accounting purposes, customer data is stored on our own servers at the offices of dr. dresing & pehl GmbH in Marburg and archived in accordance with statutory retention periods. Access to the data is regulated in particular by an access concept (password, restrictive assignment of rights, etc.).
4. Payment processing
We do not store data for credit card payments or direct debit orders. The processing of these payment methods is forwarded directly to the payment service provider BS PAYONE GmbH in Frankfurt am Main via so-called iframes.
Payments via PayPal are made by you directly on the payment page of PayPal (for European customers PayPal (Europe) S.à r.l. et Cie, S.C.A., in Luxembourg).
5. Server infrastructure
Communication between clients and the server for automatic speech recognition takes place via a REST API provided by the server. SSL/TLS 1.2 is used for transport encryption. The server for automatic speech recognition is located in an ISO-certified data center in Germany.
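On the client side, a minimum transport encryption level of TLS 1.2 can be enforced with Python's standard `ssl` module. The following is a minimal sketch. The function name `make_client_context` is hypothetical, and the sketch does not describe how the f4 clients are actually implemented.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a TLS context that verifies certificates and refuses anything below TLS 1.2."""
    ctx = ssl.create_default_context()  # enables certificate and host name checks
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

Such a context can be passed to an HTTPS client, for example via the `context` parameter of `urllib.request.urlopen`, so that connections using older protocol versions fail instead of silently downgrading.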
Authentication is carried out for each request using Basic Authentication (user name / password). The user’s password is stored in the speech recognition server’s database as a bcrypt hash (128-bit salt). The password must follow our password guidelines (minimum 10 characters, at least one lowercase and one uppercase letter, one number and one special character).
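The password guidelines and the Basic Authentication header can be illustrated with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and "special character" is interpreted here as any character in `string.punctuation`, which may differ from the rule actually applied by the server.

```python
import base64
import string


def meets_password_policy(password: str) -> bool:
    """Check: at least 10 characters, one lowercase, one uppercase, one digit, one special character."""
    return (
        len(password) >= 10
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        # Assumption: ASCII punctuation counts as "special character".
        and any(c in string.punctuation for c in password)
    )


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the HTTP Authorization header for Basic Authentication (RFC 7617)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Because Basic Authentication only encodes the credentials and does not encrypt them, it relies on the transport encryption described above.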
The clients can be f4 or a browser. The browser application is made available on a server in Hetzner’s data center.
The media file remains unencrypted on the speech recognition server until the end of speech recognition and is deleted immediately after recognition is complete.
The transcript is created once recognition is complete and is asymmetrically encrypted using elliptic curve cryptography (ECC, curve secp256k1). A separate key pair is generated for each order. When using f4, the private key remains on the client computer and is never stored on one of our servers. This means that we cannot decrypt the transcript.
When used via the browser, the private key is stored on the server in a keyring. Each user has a separate keyring that holds the private keys for all of that user’s orders. The keyring is symmetrically encrypted with the customer’s password (AES 256). When a transcript is retrieved via the browser, the private key of the transcript is retrieved from the keyring using the password supplied for authentication and transferred to the speech recognition server for decryption. The decrypted transcript can then be downloaded. A transcript decrypted in this way is automatically deleted after 60 seconds at the latest.
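AES 256 requires a key of exactly 256 bits, so a password has to be turned into such a key before it can encrypt a keyring. How this is done on our servers is not specified here. The Python sketch below shows one common approach, PBKDF2 with HMAC-SHA-256, purely as an assumption. The function name `derive_keyring_key`, the salt handling and the iteration count are hypothetical.

```python
import hashlib
import os


def derive_keyring_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte (256-bit) key from a password (assumed KDF: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def new_salt() -> bytes:
    """Generate a random 16-byte salt that is stored alongside the encrypted keyring."""
    return os.urandom(16)
```

With such a scheme, the keyring can only be opened by someone who knows the password, which matches the statement that the private key is retrieved using the password supplied for authentication.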
After successful retrieval, the encrypted transcript remains on the speech recognition server for approximately one hour and is then automatically deleted.
Status: 10.09.2024