Here are some tips about Qt and embedded software projects that I have gathered over the years while working on Qt and as a consultant on Qt projects. Some of them are less surprising than others, and some are rather subjective:
Choose the right license for Qt, i.e. whether you want to buy licenses from The Qt Company or whether you want to go with LGPL. There is a nice presentation from Burkhard Stubert about this.
When choosing hardware: Always choose a GPU whose OpenGL driver is well supported. Also, when using QML, choose a system with enough RAM; I would say 1 GB is the minimum here. Boot to Qt might be a good helper as well.
Writing a Qt app with a single fullscreen HMI using Qt's "eglfs" plugin is a safer choice than using a windowing system; the only things that can get in your way are the GPU and the OpenGL driver.
Writing a multi-window HMI with Wayland can be tricky with NVIDIA cards. Wayland with Intel graphics cards is the safer way to go.
When building Qt, you can improve performance with options like -openssl-linked, -fontconfig (should be detected automatically when fontconfig is installed) and others.
When using Qt built with fontconfig, make sure not to use QML FontLoader elements with big fonts (e.g. fonts with Asian characters). Last time I checked, FontLoader read the font files synchronously and kept the whole file in memory, which for big fonts might amount to up to 10 MB.
Load QML files as resource files instead of reading them from disk.
My personal rule of thumb: When starting to cross compile, it is a good time to introduce a CI system to check that the build is still working on all architectures.
When using Loader objects in QML: Make sure the loaded code is fast enough; you might want to use the Qt Quick Compiler (see next item) or check out a "chained loading approach": First load the main UI; once it has loaded ("Component.onCompleted"), load the code the user might click on next. A separate blog post will hopefully follow soon.
Use the Qt Quick Compiler to improve performance on target hardware, especially at startup and when using Loader objects. Startup time can be improved by 30%, as this example shows, which is in line with a benchmark from one of my current projects.
As written elsewhere already: Keep your QML layer as thin as possible, avoid JavaScript, and if you can write something in C++, do so.
When doing UI testing with Squish, start writing the tests when the UI is pretty stable. Otherwise you might spend a lot of time rewriting Squish tests when the designers change the look of the UI.
Always have an option to build and run your code with address sanitizer and other sanitizers. If you don't have a build option, you might just want to do it manually from time to time.
When on Windows: Using sanitizers might be a good reason to use Clang instead of MSVC.
If cross-platform support is important, e.g. your desktop client is running on Windows and your target client on Linux, then Clang and CMake seem the natural way to go.
Static code checking might be helpful if you are willing to dig through the massive number of false positives. I personally think sanitizers are way more helpful, though.
Only use git submodules (or SVN externals) when you have to, e.g. if you include a project that has a different release schedule and whose API might change.
Try your app on target hardware as early as possible. If there is no target hardware available, you might want to try it on similar hardware to check whether compilation (e.g. on ARM) and deployment work and whether the performance is acceptable. Some apps behave perfectly fine on desktop and turn out to be too slow on an embedded device.
Create new threads only if necessary, e.g. if an API you want to call takes some time and is synchronous, or if there is another event loop. You should always be in control of the number of threads on your system. Don't tell fellow developers to "just put your stuff into a thread"; this doesn't scale.
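One way to stay in control of the thread count is to funnel blocking calls through a single long-lived worker instead of spawning a thread per call. The following is a minimal sketch using only the C++ standard library; the class name SerialWorker and its post() method are made up for illustration and are not a Qt API. In a Qt project, a QThreadPool with a fixed maximum thread count could play a similar role.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: runs posted tasks one after another on exactly one
// background thread, so the number of threads stays fixed.
class SerialWorker
{
public:
    SerialWorker() : m_thread([this] { run(); }) {}

    // Finishes all tasks that are still queued, then joins the thread.
    ~SerialWorker()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopping = true;
        }
        m_wakeup.notify_one();
        m_thread.join();
    }

    // Queues a task; returns immediately without blocking the caller.
    void post(std::function<void()> task)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tasks.push(std::move(task));
        }
        m_wakeup.notify_one();
    }

private:
    void run()
    {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_wakeup.wait(lock, [this] { return m_stopping || !m_tasks.empty(); });
                if (m_tasks.empty())
                    return; // stopping and nothing left to do
                task = std::move(m_tasks.front());
                m_tasks.pop();
            }
            task(); // run outside the lock so post() is never blocked by a task
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_wakeup;
    std::queue<std::function<void()>> m_tasks;
    bool m_stopping = false;
    std::thread m_thread; // declared last: starts after the other members exist
};
```

A caller would create one SerialWorker for, say, a slow synchronous API and post() every call to it; results still need to be handed back to the UI thread by whatever mechanism the application already uses.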
Only use Qt Plugins (QQmlExtensionPlugin instances) if you have to; resolving symbols at runtime might be expensive. Creating a normal library instead of a plugin is faster.
If you are using QML and are accessing the network, there is no need to create a QNetworkAccessManager: You can reuse the one from "QQuickView::engine()->networkAccessManager()". This allows you to reuse cached DNS records, open sockets, TLS sessions etc. and can speed up network requests. For more information, feel free to check out one of my talks about QtNetwork performance.
Do not blindly call QNetworkReply::ignoreSslErrors() or QSslSocket::ignoreSslErrors(). If in doubt whether you are doing it right, do not call those methods at all.
On Linux, you might want to try setting the environment variable QSG_RENDER_LOOP to "threaded" if your configuration doesn't use the threaded render loop already. Qt takes a conservative approach about when to use the threaded render loop, so your configuration may not be eligible for threaded rendering by default, even though it would work. For more information, see https://doc.qt.io/qt-5/qtquick-visualcanvas-scenegraph.html#scene-graph-and-rendering.
Use qmllint to check your QML files for syntactic errors, either manually from time to time or in your CI system.
When benchmarking code on Linux, you can simulate a cold system start by deleting the file system caches via "sync; echo 3 > /proc/sys/vm/drop_caches". Otherwise your second and third app runs will be faster than the first one.
When measuring memory usage on Linux, make sure to measure only the memory that is actually loaded, e.g. via this cryptic line: "grep '^Pss:' /proc/$(pidof MyApp)/smaps | rev | cut -d ' ' -f 2 | rev | awk '{total+=$1} END{print total}'". Bottom line: Use the "Pss" or "Rss" field of smaps to measure memory.
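The same measurement can be taken from inside a program, for example to log memory usage from a test harness. The following is a minimal sketch using only the C++ standard library; the function names sumPssKb and pssOfProcessKb are made up for illustration. It sums every "Pss:" line of an smaps file, like the shell pipeline does, and deliberately skips related fields such as "Pss_Anon:".

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Sums all "Pss:" entries (reported in kB) from smaps-formatted text.
long long sumPssKb(std::istream &smaps)
{
    long long totalKb = 0;
    std::string line;
    while (std::getline(smaps, line)) {
        // Match exactly "Pss:", not similar fields such as "Pss_Anon:".
        if (line.compare(0, 4, "Pss:") != 0)
            continue;
        std::istringstream fields(line.substr(4));
        long long kb = 0;
        if (fields >> kb)
            totalKb += kb;
    }
    return totalKb;
}

// Convenience wrapper for a process ID given as text ("self" also works
// on Linux). Returns -1 if the smaps file cannot be opened.
long long pssOfProcessKb(const std::string &pid)
{
    std::ifstream smaps("/proc/" + pid + "/smaps");
    if (!smaps)
        return -1;
    return sumPssKb(smaps);
}
```

Calling pssOfProcessKb("self") reports the calling process; reading another process's smaps typically requires the same user or root privileges.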