They might do, very occasionally, for deep learning research, but never in connection with your name, location or other clinical details.
To teach computer algorithms to find patterns, we need to give them examples. For the research mentioned in the question above, we had to indicate what type each utterance was (a greeting, setting an agenda, setting homework, being empathetic, etc.) so that we could see how the content of a session affected outcomes. This helps ensure we use the right ingredients in sessions to achieve the best possible recovery rates.
This teaching requires a research scientist to access a limited number of randomly selected therapy sessions so that they can categorise the content manually. The sessions are extracted from the database in isolation, without the patient's or therapist's full names, and detached from the patient health record and any other personal data fields.
To teach the models, researchers accessed 0.15% of sessions, selected at random (i.e. about 3 in every 2000). Any future research involving deep learning will likely require a similar process. (Much of our research, however, is actually done on anonymised or deidentified subsets of the NHS IAPT Minimum Data Set; very little involves researchers accessing conversations between patients and therapists.)
Some research may also require deep learning techniques to be applied to the therapist's notes (summary) of a session. In this case, a researcher will need to tag randomly selected therapist notes, disconnected from the rest of your medical file, in the same way and in approximately the same quantities as described above for tagging the word-for-word record of your therapy session.