This post includes some additional background information to support and
clarify our recent QtMobility Multimedia Blog post.
General architecture

The above block diagram gives a high-level view of multimedia in Qt.
Services
The Multimedia services classes in the Qt 4.6 QtMultimedia module form Qt’s
access to the lowest level of multimedia functionality. The Audio classes
provide a cross-platform means to access audio devices and to send data to or
receive data from them. The audio data would usually be PCM, but could be in
other formats as well. The Video classes have a similar purpose; those in 4.6
exist to provide an abstraction for video outputs.
The Audio classes can be used in a self-contained way when you are manipulating
or creating your audio data directly in your application. Examples include a
sound recorder, an effect generator, a VoIP client, event sounds, games, or any
situation requiring intermittent or lower-latency playback.
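To make "creating your audio data directly in your application" concrete, here
is a minimal sketch in plain C++ of generating a buffer of signed 16-bit mono
PCM samples for a sine tone. It deliberately uses no Qt classes; the helper name
makeSineTone and its parameters are made up for this illustration and are not
part of QtMultimedia.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of any Qt API: builds a buffer of signed
// 16-bit mono PCM samples containing a sine tone.
std::vector<std::int16_t> makeSineTone(double frequencyHz,
                                       int sampleRateHz,
                                       double durationSeconds,
                                       double amplitude = 0.5)
{
    assert(sampleRateHz > 0);
    assert(durationSeconds >= 0.0);
    assert(amplitude >= 0.0 && amplitude <= 1.0);

    const double pi = std::acos(-1.0);
    const std::size_t frameCount =
        static_cast<std::size_t>(std::lround(sampleRateHz * durationSeconds));

    std::vector<std::int16_t> samples(frameCount);
    for (std::size_t i = 0; i < frameCount; ++i) {
        // Time of this sample in seconds.
        const double t = static_cast<double>(i) / sampleRateHz;
        const double value = amplitude * std::sin(2.0 * pi * frequencyHz * t);
        // Scale from [-1.0, 1.0] to the signed 16-bit range.
        samples[i] = static_cast<std::int16_t>(std::lround(value * 32767.0));
    }
    return samples;
}
```

An application would hand the raw bytes of such a buffer to an audio output,
together with a format description that matches it (sample rate, sample size,
channel count).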
The service classes are also meant as building blocks for other media
functionality: they simplify, and provide cross-platform support for, the
building of playback and recording functionality.
High Level APIs
Phonon
Phonon is a high-level API that provides facilities for playback of media
content, and it is currently what Qt uses for media playback. We will
continue to maintain Phonon for the duration of the 4.x series, but new
functionality will be added in the new multimedia module.
So why didn’t we try to improve Phonon? There are a lot of interconnected
reasons that would prevent Phonon from moving forward. We felt that there
would need to be either too much compromise or major changes that break source
compatibility, let alone binary compatibility (BC), in order to achieve our
goal of a full-featured high-level framework.
Briefly, some of the problems found with Phonon were as follows:
- The MediaSource class cannot adequately represent the variety of data
sources available. For example, media sources are often not single items of
interest, but a choice of available content (not substreams of content).
Additionally, it is often necessary to bundle extra data with the playback
content, such as previews and posters; there needs to be a way of uniformly
representing these items and tagging them appropriately.
- Some playlist-like functionality is bound into the media object, which limits
the ability of a media subsystem to participate in playlist management. That
participation is vitally important for certain classes of playback service.
- Metadata support is bound to the media object; this limits the scope for
collaboration with other media service providers that may have useful metadata.
- The attempt to express a high-level media graph imposes a model of operation
on the underlying frameworks that cannot necessarily be supported, and the
obvious means of routing around this failure cannot be made to work
reliably.
- A Qt Multimedia API needs to exist in a cross-platform (hardware and
software) environment; this means the attempt at graph manipulation imposes an
added customization burden on existing Phonon backends.
- The backend is responsible for creating objects and does not allow
applications to become involved in the media conversation; this places limits
on presentation options.
- Dynamic backend changes
- Services are only available through plugins loaded by Phonon, again limiting
the media conversation.
- Non-local, out of process, playback services are difficult to implement.
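As an illustration of the first point in the list above, the following C++
sketch models a piece of content as a set of tagged resources (alternative
playback encodings plus a preview and a poster) rather than a single source.
Every type and function name here is hypothetical; this is neither the Phonon
API nor the QtMobility API, only one possible shape for such a structure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// All names below are hypothetical and exist only for this illustration.

// The role a resource plays within one piece of content.
enum class ResourceRole { Playback, Preview, Poster };

// One concrete location for the content or for data bundled with it.
struct MediaResource {
    std::string url;
    std::string mimeType;
    int bitrateKbps;      // 0 when not applicable, e.g. for a poster image
    ResourceRole role;
};

// A piece of content: a choice of tagged resources, not a single source.
struct MediaContent {
    std::vector<MediaResource> resources;

    // Returns copies of all resources tagged with the given role.
    std::vector<MediaResource> withRole(ResourceRole role) const
    {
        std::vector<MediaResource> matches;
        for (const MediaResource& r : resources) {
            if (r.role == role)
                matches.push_back(r);
        }
        return matches;
    }

    // Picks the highest-bitrate playback alternative within the limit;
    // returns nullptr when no playback resource fits.
    const MediaResource* bestPlayback(int maxBitrateKbps) const
    {
        assert(maxBitrateKbps >= 0);
        const MediaResource* best = nullptr;
        for (const MediaResource& r : resources) {
            if (r.role != ResourceRole::Playback || r.bitrateKbps > maxBitrateKbps)
                continue;
            if (best == nullptr || r.bitrateKbps > best->bitrateKbps)
                best = &r;
        }
        return best;
    }
};
```

With a structure along these lines, a player could choose among the playback
alternatives according to available bandwidth, while a user interface could
ask the same object for its poster or preview.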
QtMobility Multimedia APIs
The Mobility Multimedia APIs are high-level APIs. There is a place for
high-level APIs in a media universe: although there is a class of applications
that requires the ability to manipulate media streams at varying levels, there
is also a class of applications that can do useful work without detailed
knowledge of the media subsystem or the elements used in it.
The Mobility Media APIs are made to support playback, recording, playlist
management, metadata, radio and camera and, as time permits, transcoding, media
editing, TV and potentially high-level stream management.
There is already support for playback, recording, radio, metadata and playlist
management, plus some experimental camera support.
Like all of the Mobility projects, the QtMobility Multimedia API is targeted to
merge with Qt at some future date, which has not been decided yet.
The QtMobility Multimedia API competes with Phonon on playback services.
There is little sense in having two competing media frameworks in Qt.
Future Direction
There are lots of places we would like to go; multimedia is an exciting area to
work in. As mentioned above, at the higher level we would like to see
transcoding, media editing, and TV support. We would also like to shift down
slightly and open up access to streams without confusing the “top-level” API.
We would also like to build upon the services layer to provide a more complete
graph-oriented, audio-only framework, as well as a low-level video capture
service, but only time will tell what gets implemented and when.
Further information
I hope this brings some clarity to your view of multimedia in Qt. If you have
any further questions you can always comment here, catch us on IRC (#qt-labs;
just ping multimedia, someone should pick it up) or send an email to
qt-interest at trolltech.com.
14 comments
Thanks for your explanation, I guess it was badly needed looking at the number of comments on the previous post. I develop a Phonon based app and I hit a few problems (not in the API itself, but in the backend implementations). I’ll happily switch to the new high level APIs when they’re ready. The great thing, for me, is that Qt is putting developer resources behind multimedia features. Whatever they come up with, I’ll be a happy camper.
“The MediaSource class can not adequately represent the variety of data sources available. [...]”
Extending the MediaSource class is easily possible without breaking BC. “bundle additional data” should be one of the rather trivial extensions to add to MediaSource.
“Some playlist like functionality is bound into the media object…”
BTW, that was actually implemented that way because of the comments in the API review at the Oslo offices.
Anyway. Most applications I know of want to manage playlists themselves. Isn’t that one of the things that differentiates media applications? The playlist like functionality in the MediaObject makes it easy to implement queuing while keeping the control with the application. Phonon was designed with the application developer in mind and not with the media subsystem. That sometimes shows when you have to implement a backend… But from all I know (if the backend is implemented correctly) the application developer values that design decision.
What classes of playback services did you have in mind when you say that the media subsystem must participate in playlist management?
“Metadata support is bound to the media object, this limits the scope for collaboration with other media service providers that may have useful metadata.”
The meta data support in MediaObject was added mostly for radio streams where you have one source (the stream) but the meta data changes while playing. Originally meta data was supposed to stay out of the Phonon API (there are good libs already like taglib), but since this one case could not be covered by other libs Phonon had to provide this API.
If there is a need for additional API besides taglib and MediaObject providing the meta data of streams, then I don’t see the problem of adding that. Actually there were concrete plans on API how to do this. I expected Nokia to do this once they were open for feature development in Phonon again.
… enough for now. I will answer the other points some other time…
I tried once to make a VOIP client. But I could not find a way to pass the decoded audio Stream to Phonon. Is this possible at all?
Let me start from the bottom up, with the things I know:
“- Non-local, out of process, playback services are difficult to implement.” is provably false, as both the Phonon-VLC and Phonon-MPlayer backends were originally simple wrappers around the players themselves, which were running out-of-process (phonon-vlc is now using libvlc properly). What do you mean by non-local playback? Streaming audio over network? I don’t see why that could not be implemented in Phonon.
“- Services only being available by plugins loaded by Phonon, again limiting the media conversation.” What kind of services? Like the kbytestream-from-kio for playing over the network support? Why not just extend Phonon? Make a graph-element that exposes the raw data, and takes it back in again.
“- Dynamic backend changes”. Isn’t this a reason to keep Phonon? Or do you mean that it doesn’t work in Phonon? Why not just fix Phonon?
“– A Qt Multimedia API needs to exist in a cross platform (hardware and software) environment, this means the attempt at graph manipulation imposes an added customization burden on existing Phonon backends.”. So how does the new API intend to give the application developers the same degree of freedom as Phonon? This is arguably a hurdle with making new Phonon backends, but I think it is worth it, considering the freedom the application developers get.
“- The attempt to express a high-level media graph imposes a model of operation on the underlying frameworks that can not necessarily be supported, the obvious means of routing around this failure can not be made to work reliability.”. From this statement, it seems like you have not looked into the different Phonon backends that exist already. Do you have any examples of frameworks that are unable to represent the graph? If mplayer can do it, I think pretty much anyone can.
Also, I would have liked some comments on how the new API intends to solve the problems you presented with Phonon. Will it not be cross-platform? Won’t it have pluggable backends? (If so, won’t you put all your eggs in one basket, so to speak?)
Also, as it seems like some (if not most) of your concerns with Phonon were unsubstantiated, why didn’t you contact the Phonon developers before starting from scratch?
And how will we know that this new API won’t be put on life-support right after it is integrated, like Phonon was?
I also wonder about the lack of API review the new QtMultimedia API seemingly has received (some parameters are passed as ‘QString’ when they seemingly should be ‘const QString &’, for example in QAudioFormat::setCodec). It seems like the new API is a bit rushed.
I’m sorry if I’m a bit up-front, but this blog-post didn’t really answer any of the questions I had, but rather gave me many more.
And lastly, to round it off with an anonymous quote from a developer of one major Qt multimedia application, which I think sums things up quite nicely about the new QtMobility/multimedia API:
«I know one thing: they are going to seriously regret the playlist management part;
“help, there is too much application in my API”»
Thanks for the overview – it’s very helpful to get a heads up of where Qt is heading.
At the moment I’m more interested in the low level QtMultimedia API than the high level ones since I’m currently using portaudio for cross-platform device level audio, and will be glad to see Qt provide that instead to remove a dependency and hopefully add future target support.
I do have a couple of concerns with what’s been laid out though:
1) It would be a shame for video to become a lower development priority than audio and fall behind it. I don’t understand why low level video capture isn’t being provided in Qt 4.6 alongside the audio capture in QtMultimedia, and am concerned that it’s being presented as a vague future wish list item rather than a concrete deliverable for Qt 4.7.
2) While I’m agnostic on Phonon vs the new QtMobility Multimedia API, I think it would be a shame to see Qt not take advantage of mature backend support like DirectShow and GStreamer. Would it not be possible to have multiple backends to the new API – both the existing platform provided ones, and additionally a Qt built one based atop QtMultimedia both for those platforms that need it, and as an alternative for those that don’t?
“Phonon-VLC and Phonon-MPlayer backends were originally simple wrappers around the players themselves”
phonon-vlc has used libvlc since the very first line of code. phonon-mplayer, on the other hand, uses mplayer.exe through a QProcess. I know because I wrote them.
[deleted personal attack]
@tanguy_k: heh, ok. I thought the GSoC for VideoLan last year was for porting it to libvlc, but I never checked it.
In response to spinynorman’s comments:
1) We have been doing a lot of research into low-level video, but it is a lot more complex than audio. It has not been forgotten at all; I see it as being very important. But it must be right too. When something that devs here are happy with is semi-concrete, it will be added to the roadmap and planned for a release.
2) We have a team of devs working full-time on making the multimedia framework the best it can be, but it is not done yet or it would be in the release. The focus is on the API and getting it right; when this is done we can focus more effort on backend implementations. We have done an audio capture backend for Mobility that uses the QtMultimedia audio classes for recording; I will have a look, when things aren’t so crazy, at adding playback functionality.
What about scanning?
I know it may not be the core of multimedia, but it’s still part of it…
A cross-platform scanning solution would be great to have (along with video capturing).
Phonon in Qt 4.5 doesn’t support webcam capture and video stream saving. We need this feature for our application and I had to hack around to solve this issue. Apparently the Qt Multimedia announced for Qt 4.6 only supports audio input and recording, and thus won’t support webcam capture. Is it reasonable to hope this will be reconsidered?
I’m writing a Qt application to control an X Ray scanner equipped with a webcam. This feature is really missing. I don’t care if I have to use Phonon or another module. All I need is the functionality with good documentation and integration into Qt. Why can’t we have this with Phonon, which has been around for some time now?
Qt Multimedia doesn’t support webcam recording, but Qt Mobility multimedia is going to support recording (including webcams).
“Qt Multimedia doesn’t support webcam recording, but Qt Mobility multimedia is going to support recording (including webcams).”
[sarcasm] That makes a lot of sense [/sarcasm]
While I’ve followed Qt since 2.x, this whole Multimedia/Mobility project just, well, sucks. It seems a departure from what Qt has been doing for years. I do see the need for something to fill the gap between Qt and all the new devices (things that aren’t disk, memory, screen or network, like video and GPS) but something is seriously wrong here. I think the proper path is to continue Phonon for now and ever, have a Multimedia abstraction above that, and above that, if needed, mobility. However I doubt anyone really needs anything above mobility once proper hardware devices have the proper Qt-esque API support. What they need is a generic programming convention. I think this should come from an API style and not another library. The QDBusAbstractInterface stuff should serve as a template for video, GPS, etc. While it may not be on DBUS proper, the “QtBUS” should collect these multimedia and mobility services.
Here I was searching for some help with Qt and GStreamer and I stumble upon this article… Sounds like those commenting here are more knowledgeable about multimedia than those I’ve spoken to at Qt support.
This is indeed alarming since our goal was to develop a Qt app that plays multimedia using some codec plug-ins for GStreamer written by TI to access hardware acceleration through the use of their DSP. My research has led me to believe that I could not use Phonon, but instead had to code using GStreamer directly to set up my pipeline (which appears to work now).
So the question I’m trying to answer is whether or not there is some mechanism to “widgetize” the GStreamer playback like Phonon seemed to allow (although I’m a new Qt AND GStreamer developer now). I want to be able to resize, relocate, etc like I would be able to do with Phonon using standard codec playback.
I’m wondering if the new direction that Qt seems to be taking with Phonon will end up helping or hurting my effort. If anyone knows of a resource to point me towards I would be very grateful!