How we create a secure process
In the following, we describe in detail how we have technically designed the speech recognition process to be secure and transparent.
The procedural concept was submitted to the Hessian Data Protection Commissioner. Since then, we have added further security measures, e.g. AppArmor, which only allows processes on our servers that are required for speech recognition.
Data protection regulation of the automatic transcription of audio files
I. Initial Situation / Brief Description of the Company
dr. dresing & pehl GmbH, Deutschhausstraße 22A, 35037 Marburg, (hereinafter also referred to as: “we”) has been selling software under the brand name “audiotranscription” since 2005 as well as (as an optional supplement to this) so-called “foot switches” for the manual transcription of interviews. The software for transcription is mainly used in the university context and is a central component of qualitative method training there.
In the following, we therefore describe our contractual and technical measures to make the automatic speech recognition compliant with data protection. This is the basis on which the transcription service will be offered to customers in the future.
II. Clarification of terms
In the following, a distinction is first made between:
“contract data”, i.e. data of the person using the service (hereinafter “you”), such as name, address, etc., and
“order data”, i.e. audio files and the corresponding text files uploaded by the person using the service to order our services as well as the respective transcribed data. These files may contain personal voice data of the user as well as voice data of third parties (the recorded persons).
III. Information of customers according to Art. 13 GDPR
Before or when creating a customer account, you will be informed in detail about data processing in accordance with Art. 13 GDPR.
IV. Overview of the workflow
You upload interviews or other audio files to a server of dr. dresing & pehl GmbH via a software client (f4transcript or a web client). There, the audio files are automatically converted into text. The generated text is displayed in the software client and can be further processed there locally. All data uploaded to the server is deleted after transcription and transfer to the software client. The individual steps of this process are explained in more detail below:
1. Software installation/registration
The prerequisite for using the service is the installation of the f4transcript software or a corresponding web client. Here you must register personally before using the speech recognition service. A corresponding dialogue will be displayed by the software to be installed locally. Your registration takes place in the following steps:
Step 1: Assigning a user name and password.
In the first step, you will be asked for your e-mail address, a password of your choice and confirmation that you have read and understood the data protection declaration (which can be viewed via a link). The password to be chosen must meet certain minimum requirements (a combination of upper/lower case, special characters, numbers, at least 10 characters).
The password can be changed after authentication by e-mail. When the e-mail address is entered in the corresponding field of the client, a code is sent to the e-mail address stored. Only after entering this code can the password be changed.
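The minimum password requirements from Step 1 could be checked along the following lines. This is a minimal sketch, not the implementation actually used; the function name `meets_password_policy` is hypothetical, and treating ASCII punctuation as the set of "special characters" is an assumption.

```python
import string

MIN_LENGTH = 10  # "at least 10 characters" per the stated policy


def meets_password_policy(password: str) -> bool:
    """Return True if the password satisfies the stated minimum requirements."""
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # Assumption: a "special character" is any ASCII punctuation character.
    has_special = any(c in string.punctuation for c in password)
    return (
        len(password) >= MIN_LENGTH
        and has_lower
        and has_upper
        and has_digit
        and has_special
    )
```

For example, `meets_password_policy("Abcdefgh1!")` is accepted, while the same string without the final special character is rejected.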
Step 2: Confirmation of registration
To verify the specified e-mail address, a code is sent to that address. Only when you have entered this code via the login dialogue in the client will the account be activated. Unconfirmed data will be deleted after 24 hours.
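The confirmation step can be illustrated with a small sketch. The names `new_verification_code` and `is_expired` are hypothetical, and the six-digit numeric format of the code is an assumption; only the 24-hour deletion period comes from the text above.

```python
import secrets
from datetime import datetime, timedelta, timezone

UNCONFIRMED_TTL = timedelta(hours=24)  # from the text: deleted after 24 hours


def new_verification_code(digits: int = 6) -> str:
    """Generate a random numeric code (the 6-digit length is an assumption)."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def is_expired(registered_at: datetime, now: datetime) -> bool:
    """True once an unconfirmed registration is at least 24 hours old."""
    return now - registered_at >= UNCONFIRMED_TTL


if __name__ == "__main__":
    created = datetime.now(timezone.utc)
    print(new_verification_code(), is_expired(created, created))
```

A periodic clean-up job could then delete every unconfirmed registration for which `is_expired` returns `True`.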
Step 3: Conclusion of a contract on commissioned processing (ADV)
After confirming the registration, you will be shown a dialogue for concluding an ADV. It contains the text of the contract, including a list of technical and organisational measures and the sub-processors (server hoster). You have the option of entering the purpose of the processing and the type of personal data to be processed separately. The text of the contract will be sent by e-mail after confirmation by you (conclusion of contract pursuant to Art. 28 (9) GDPR).
Step 3a (optional): Obligation to maintain confidentiality in accordance with Section 203 of the German Penal Code (StGB).
Some groups of persons (e.g. in legal or medical activities) are subject to special provisions on confidentiality according to § 203 StGB. In order to enable the processing of data, it is necessary in these cases that we explicitly commit ourselves and subcontractors (beyond the provisions of the ADV) to secrecy in accordance with § 203 StGB. Upon request, you will optionally receive a corresponding commitment in electronic form.
2. Activation for the upload of order data
The account will only be activated for the upload of order data to our server after the registration has been completed. The registration information is stored on the voice recognition server and is physically and logically separated from billing data (see section VII. Data processing infrastructure).
3. Purchase of time quotas via the online shop
The use of automated speech recognition is made possible on the basis of time quotas. The time quotas can be purchased in advance in the form of credit codes via our online shop. These codes are generated by our activation server (logically and physically separated from the speech recognition server) and sent by e-mail. The codes are not personal and can be used by any (but registered) person to top up their own time quota.
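The idea of non-personal credit codes can be sketched as follows. `QuotaLedger` and its methods are hypothetical names, the in-memory dictionaries merely stand in for the activation server and the quota bookkeeping, and the rule that a code can be redeemed only once is an assumption.

```python
import secrets


class QuotaLedger:
    """In-memory stand-in for code issuing and quota bookkeeping (hypothetical)."""

    def __init__(self) -> None:
        self._unused_codes: dict[str, int] = {}  # code -> minutes of credit
        self._quota: dict[str, int] = {}         # registered account -> minutes

    def issue_code(self, minutes: int) -> str:
        # The code carries no reference to the buyer, so it is not personal.
        code = secrets.token_urlsafe(12)
        self._unused_codes[code] = minutes
        return code

    def register(self, account: str) -> None:
        self._quota.setdefault(account, 0)

    def redeem(self, account: str, code: str) -> int:
        if account not in self._quota:
            raise PermissionError("only registered accounts can redeem codes")
        # Assumption: each code can be redeemed exactly once.
        minutes = self._unused_codes.pop(code)  # KeyError if unknown or used
        self._quota[account] += minutes
        return self._quota[account]
```

Because the code is not tied to the purchaser, whoever holds it can top up any registered account, which matches the description above.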
V. Processing of individual orders
As “processing of individual orders” we describe here the upload of an audio file to our server, the processing there and the download of the finished results until the deletion of the individual order data. The order data is only stored on the server for as long as is necessary for the purposes of the processing. Afterwards, the order data is transferred back to your computer and stored there locally by you.
1. Uploading audio files
Audio files can be uploaded to our server if you are registered and logged in to a client. The client generates an asymmetric key pair for each audio file during the upload. The public key is sent to the server together with the audio file during the upload (job key). When using f4transcript, the private key is encrypted with your secret password and stored on the client computer. This ensures that the job data can only be decrypted from the registered client. When using f4x via the browser, the private key is stored in encrypted form on a separate server (separate from the speech recognition).
Uploading to our server is done via a secure connection. File names are pseudonymised by random but unique names before processing. When using f4transcript, this already happens during the upload.
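Pseudonymising a file name with a random but unique name could look like the following sketch. The function name `pseudonymise_filename` is hypothetical, and both the use of a random UUID and the choice to keep the file extension are assumptions made for illustration.

```python
import uuid
from pathlib import PurePath


def pseudonymise_filename(original_name: str) -> str:
    """Replace the file name with a random, unique one, keeping only the suffix."""
    suffix = PurePath(original_name).suffix.lower()
    # uuid4 is random; collisions are practically impossible, so names stay unique.
    return f"{uuid.uuid4().hex}{suffix}"


if __name__ == "__main__":
    print(pseudonymise_filename("Interview Mrs Smith.MP3"))
```

The original name, which may itself contain personal data such as the name of an interviewee, never appears in the generated name.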
For processing, the audio file is decoded by the speech recognition algorithm and converted into a text file. The audio file is deleted immediately after successful conversion into a text file. The finished text file is encrypted with the job’s public key and stored temporarily on the server for retrieval.
The server reports a status to the client for each job. Once a job has been converted successfully, this status is reported to the client and the “Download” button is activated there.
The finished text files can be downloaded from the client. After successful download, the text file is decrypted by the private key on the client. When using f4transcript, the combination of public and private key ensures that the results can only be decrypted on the computer from which the job was uploaded. When using f4x via the browser, the result can only be decrypted with correct credentials.
As soon as the server receives the message about the successful download, the file will be permanently deleted from the server.
If an error occurs during the upload, e.g. due to an unrecognised file format or the termination of the connection, the incomplete audio file is immediately deleted from the server. The client then receives a corresponding message.
If a result is not collected after 14 days, you will receive a notice by e-mail. If this notice remains unanswered, you will receive another reminder after 7 days. Should the 7-day deadline for collection stated therein expire, the order data will be deleted and the client will be informed of this by e-mail.
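The deadlines for uncollected results can be worked through as a short calculation. This is a sketch under one assumption: the 14 days are counted from the day the result becomes available for download. The function name `retention_schedule` is hypothetical.

```python
from datetime import date, timedelta

FIRST_NOTICE_AFTER = timedelta(days=14)  # notice if the result is not collected
REMINDER_AFTER = timedelta(days=7)       # reminder if the notice is unanswered
FINAL_DEADLINE = timedelta(days=7)       # collection deadline stated in the reminder


def retention_schedule(ready_on: date) -> dict[str, date]:
    """Return the dates of the first notice, the reminder and the deletion."""
    first_notice = ready_on + FIRST_NOTICE_AFTER
    reminder = first_notice + REMINDER_AFTER
    deletion = reminder + FINAL_DEADLINE
    return {"first_notice": first_notice, "reminder": reminder, "deletion": deletion}


if __name__ == "__main__":
    for step, day in retention_schedule(date(2023, 3, 1)).items():
        print(step, day.isoformat())
```

Under this reading, an uncollected result is deleted 28 days after it became available (14 + 7 + 7 days).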
VI. Duration of data storage and data deletion
With regard to the duration of data storage and data deletion, a differentiation is to be made as follows:
Contract data are initially stored permanently on the voice recognition server for legitimisation and order control. The deletion of the contract data takes place when the account is deleted, provided that no contractual and/or legal retention periods prevent the deletion.
Order data, i.e. the audio files and the corresponding text files, are stored for the duration of the processing until they are downloaded by you or until the agreed deletion period has expired and are then automatically deleted.
Supplementary information on the order data, such as file size and date of upload, is stored to enable the processing and invoicing of the individual orders and to document them. This data is stored for traceability by you and documentation of possible claims for as long as the account is active. When the account is deleted, the data is deleted.
Order data when purchasing time quotas (e.g. the name, address, e-mail address, telephone number (optional), date of order and number of items ordered) are uploaded to the webshop server and to our in-house server in Marburg for billing and accounting purposes and stored in accordance with legal retention periods.
Detailed information on the exact data, processing purposes and storage periods is provided in the data protection declaration.
VII. Data processing infrastructure
The data processing infrastructure used is divided into four physically independent areas. The TOMs (technical and organisational measures) in the annex to the ADV inform you about the infrastructure used. In detail:
1. Speech recognition server
The “speech recognition server” contains the speech recognition algorithm and manages the order processing and user administration. This is where the order data is temporarily stored during processing. This data is processed on a dedicated root server of Hetzner Online GmbH in Nuremberg or Falkenstein.
The computer centre is DIN-ISO/IEC-27001 certified (German accreditation body D-ZM-18855-01-00, certificate number ZN-2016-04). An order processing contract was concluded on 29.10.2018.
2. Webshop and e-mail server
The webshop for the purchase of time quotas and the e-mail services run on a server of ALL-INKL.COM Neue Medien Münnich with server locations in Dresden and Friedersdorf. The address data provided by you, the articles purchased and the correspondence by e-mail are stored here. A contract for order processing was concluded on 25.05.2018.
3. Internal order processing
For billing and accounting purposes, customer data is stored on our own servers at the business premises of dr. dresing & pehl GmbH in Marburg and archived in accordance with statutory retention periods. Access to the data is regulated in particular by an access concept (password, restrictive assignment of rights, etc.).
4. Payment processing
We do not store data for credit card payments or direct debit orders. The processing of these payment methods is forwarded directly to the payment service provider BS PAYONE GmbH in Frankfurt am Main via so-called iframes.
Payments by PayPal are made by you directly on the payment page of PayPal (for European customers PayPal (Europe) S.à r.l. et Cie, S.C.A., in Luxembourg).
5. Server infrastructure
The communication between clients and the server for automatic speech recognition takes place via a REST API provided by the server. SSL/TLS 1.2 is used for transport encryption. The server for automatic speech recognition is located in an ISO-certified data centre in Germany.
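On the client side, enforcing TLS 1.2 as the minimum protocol version can be sketched with Python's standard `ssl` module. This only illustrates the transport-encryption requirement stated above and says nothing about how the actual clients are implemented; the function name `make_client_context` is hypothetical.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """TLS client context that verifies certificates and rejects anything below TLS 1.2."""
    context = ssl.create_default_context()  # certificate and hostname checks enabled
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_client_context()
    print(ctx.minimum_version, ctx.verify_mode)
```

Such a context can be passed to `urllib.request.urlopen(url, context=ctx)` or `http.client.HTTPSConnection(host, context=ctx)` when calling a REST API.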
Authentication is carried out for each request using basic authentication (user name/password). The password of the user is stored in the database of the speech recognition server as a bcrypt hash (salt 128 bit). The password must follow our password guidelines (minimum 10 characters, minimum one lower case and one upper case letter, one number and one special character).
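With basic authentication, each request carries the user name and password in an `Authorization` header. The following sketch shows how such a header is built according to RFC 7617; the function name `basic_auth_header` is hypothetical and the credentials are placeholders.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the HTTP Authorization header for Basic authentication (RFC 7617)."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so the credentials are only protected in transit by the transport encryption of the connection.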
The available clients are f4transcript for macOS and f4xWeb (http://f4x.audiotranskription.de). f4xWeb is provided on a server in Hetzner’s data centre. Communication with f4xWeb takes place via a web browser.
The media file remains unencrypted on the speech recognition server until the end of speech recognition and is deleted immediately after recognition is completed.
The transcript is created after the recognition is completed and asymmetrically encrypted using the ECC curve secp256k1. A separate key pair is generated for each job. In the case of f4transcript for macOS, the private key remains on the client computer and is never stored on one of our servers. Thus, the transcript cannot be decrypted by us.
In the case of f4xWeb, the private key is stored on the server for f4xWeb in a keyring. All private keys of all jobs of a user are stored in a separate keyring. The keyring is symmetrically encrypted with the client’s password (AES 256). When a transcript is retrieved via f4xWeb, the private key of the transcript is retrieved from the keyring using the password determined for authentication and transferred to the speech recognition server for decryption. The decrypted transcript can then be downloaded by the client via f4xWeb. A transcript decrypted in this way is automatically deleted after 60 seconds at the latest.
After successful retrieval, the encrypted transcript remains on the speech recognition server for approx. one hour and is then automatically deleted.
Status: 20.05.2019, minor corrections 13.02.2023