
ALM for Power Platform Dataflows – The story begins

Today I was surprised and very happy when I found that we are now able to create a Power Platform Dataflow FROM WITHIN A SOLUTION! This is a game changer and a major improvement for Dataflows! From zero ALM to being able to manage Dataflows in solutions. I just had to give it a try. So many questions popped into my mind. Are we now able to export Dataflows from one environment and import them into another? Can we now use the alternative Load to New Entity and choose a prefix for the entity and the fields, i.e. the prefix of the Publisher? Has the ALM journey for Dataflows finally begun? This blog post contains my findings.

Creating a Power Platform Dataflow from within a solution

I will assume that you are already familiar with the Maker Portal, which is the place where you create your Dataflows and work with your solutions. The first thing I did was to create a new solution. I created a solution called Dataflows and tried creating a new Dataflow from within this solution. I waited with excitement and was eager to see the result.

Microsoft has some work to do here, it seems. The Dataflow did not end up in the solution at all, but I could see it under Data – Dataflows. The prefix for the created entity and the fields was just as ugly as before, i.e. not the Publisher prefix. Oh well, moving on to adding an existing Dataflow to a solution.
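If you want to double-check this behaviour without clicking around, you can query the Dataverse Web API and compare the Dataflows the environment knows about with the components that are actually inside the solution. Here is a minimal sketch in Python, assuming the Dataflow table is exposed through the msdyn_dataflows entity set with the name column msdyn_name (verify the logical names in your environment) and using placeholder URLs and credentials:

# Minimal sketch: is the new Dataflow actually a component of the solution?
# Assumptions to verify in your environment: the Dataflow table is msdyn_dataflow
# (entity set msdyn_dataflows, key msdyn_dataflowid, name column msdyn_name).
import msal
import requests

ENV_URL = "https://yourorg.crm.dynamics.com"   # placeholder environment URL
SOLUTION_NAME = "Dataflows"                    # unique name of my solution
API = f"{ENV_URL}/api/data/v9.2"

app = msal.ConfidentialClientApplication(
    client_id="<app-registration-client-id>",
    client_credential="<client-secret>",
    authority="https://login.microsoftonline.com/<tenant-id>",
)
token = app.acquire_token_for_client(scopes=[f"{ENV_URL}/.default"])["access_token"]
headers = {
    "Authorization": f"Bearer {token}",
    "OData-MaxVersion": "4.0",
    "OData-Version": "4.0",
    "Accept": "application/json",
}

def get(url: str) -> list:
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()["value"]

# 1. Every Dataflow in the environment (what you see under Data - Dataflows).
dataflows = get(f"{API}/msdyn_dataflows?$select=msdyn_name,msdyn_dataflowid")

# 2. The components that are actually inside the solution.
solution_id = get(f"{API}/solutions?$select=solutionid"
                  f"&$filter=uniquename eq '{SOLUTION_NAME}'")[0]["solutionid"]
components = get(f"{API}/solutioncomponents?$select=objectid,componenttype"
                 f"&$filter=_solutionid_value eq {solution_id}")
component_ids = {c["objectid"] for c in components}

# 3. A Dataflow that exists but is missing from the solution is exactly
#    the behaviour described above.
for df in dataflows:
    status = "in solution" if df["msdyn_dataflowid"] in component_ids else "NOT in solution"
    print(f"{df['msdyn_name']}: {status}")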

Adding an existing Power Platform Dataflow to a solution

I simply went to my solution again and this time I picked the Dataflow that I had created before. As already mentioned, the Dataflow I created from within a solution did not end up in the solution, but it could be found under Data – Dataflows, which means it had been created. So I chose to add it to my solution.

Add an existing Dataflow to a solution
Under Outside solutions the Dataflows can be found
The Dataflow is now located within a solution

It worked fine and now I had the Dataflow in my solution. It was time to export the solution and see if it could be imported into another environment. Before the export I published the solution and ran the Solution Checker.

Before you export a solution – do not forget to publish and run the solution checker!

I then exported as Managed. So far so good!
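As a side note, the export step can also be scripted against the Dataverse Web API using the ExportSolution action, which returns the same zip file you get from the portal. A minimal sketch, with a placeholder access token (acquired as in the sketch above) and environment URL:

# Minimal sketch: export the solution as Managed via the ExportSolution action.
import base64
import requests

ENV_URL = "https://yourorg.crm.dynamics.com"   # placeholder source environment
API = f"{ENV_URL}/api/data/v9.2"
headers = {
    "Authorization": "Bearer <access token>",  # acquired as in the earlier sketch
    "OData-MaxVersion": "4.0",
    "OData-Version": "4.0",
    "Accept": "application/json",
    "Content-Type": "application/json",
}

body = {"SolutionName": "Dataflows", "Managed": True}
response = requests.post(f"{API}/ExportSolution", headers=headers, json=body)
response.raise_for_status()

# The action returns the solution zip as a base64 string in ExportSolutionFile.
with open("Dataflows_managed.zip", "wb") as f:
    f.write(base64.b64decode(response.json()["ExportSolutionFile"]))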

Importing a solution containing a Power Platform Dataflow

I will just assume that you are familiar with the process of importing solutions. The import went fine. It was into another tenant where I have a trial environment. Looking at the imported solution, I saw that it also contained an entity with the display name Dataflow. Interesting!

The solution was imported without any errors
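For completeness, the import can be scripted as well with the ImportSolution action, and a quick query afterwards shows whether the Dataflow definition came over. The entity with the display name Dataflow that showed up in the imported solution appears to be the underlying msdyn_dataflow table, but treat that name as an assumption and verify it in your own environment. A minimal sketch with placeholder values:

# Minimal sketch: import the managed zip into the target environment and then
# list the Dataflow rows that came over (msdyn_dataflows is an assumption to verify).
import base64
import uuid
import requests

TARGET_URL = "https://yourtarget.crm.dynamics.com"   # placeholder target environment
API = f"{TARGET_URL}/api/data/v9.2"
headers = {
    "Authorization": "Bearer <access token for the target environment>",
    "OData-MaxVersion": "4.0",
    "OData-Version": "4.0",
    "Accept": "application/json",
    "Content-Type": "application/json",
}

with open("Dataflows_managed.zip", "rb") as f:
    customization_file = base64.b64encode(f.read()).decode()

body = {
    "CustomizationFile": customization_file,
    "OverwriteUnmanagedCustomizations": False,
    "PublishWorkflows": True,
    "ImportJobId": str(uuid.uuid4()),   # lets you follow progress in the importjobs table
}
# ImportSolution runs synchronously and returns 204 No Content on success.
requests.post(f"{API}/ImportSolution", headers=headers, json=body).raise_for_status()

# Confirm the Dataflow definition exists in the target environment.
check = requests.get(f"{API}/msdyn_dataflows?$select=msdyn_name", headers=headers)
check.raise_for_status()
for df in check.json()["value"]:
    print("Dataflow in target:", df["msdyn_name"])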

Then I wanted to try and schedule the Dataflow and also to run it. The first problem I ran into was that a connection was missing.

We cannot run the Dataflow until we have fixed the connection

I went to Data – Dataflows and opened the Dataflow. It then let me configure a connection. However, I assume that this adds an unmanaged layer to our solution, which we would rather not have.

Opening the Dataflow in the target environment lets us configure a connection

I configured the connection, closed the Dataflow, chose … again and then Refresh. The next problem I ran into was that my Dataflow was configured to load data to a new entity, I assumed (more tests showed me that I had to redo the mapping even when I had a Dataflow which was configured to load to an existing entity).

The Dataflow could not figure out how to create an entity in the target environment

Since loading to a new entity does not work well, and since it is not recommended anyway (see Conclusions at the end of this blog post), I went back to my source environment, created a new entity from scratch and a new Dataflow, and let the Dataflow load to this new entity. Both the entity and the Dataflow were created from within my solution. This time as well, I had to add the Dataflow manually to the solution since it did not end up there when I created it.

In my target environment I first deleted the old solution and imported this new one. The import went fine again. I went to Data – Dataflows, chose … next to the imported Dataflow and chose Edit. I configured the connection. Then I closed it, chose … next to the Dataflow and chose Refresh. Once again I got this error.

I opened the Dataflow from Data – Dataflows, chose … and Edit, and on the mappings page I noticed it said Load to new entity. I chose Load to existing, picked the Plant entity and had to do the mapping all over again. After that I was able to Refresh the Dataflow, and I got Plants from the Trefle API into my target environment!

Data was loaded into the target environment with an imported Dataflow

The last thing I tested was to split my solution into two: one for the entity and one for the Dataflow. I first imported the solution with the entity and then the one with the Dataflow. I went and configured the connection and clicked Refresh. Same error again – it could not find the entity and I had to redo the mapping.

Oh well, for now – let us be happy about the fact that we actually can move Dataflows between environments. Hopefully in the future we will be able to do this without adding an unmanaged layer to our solution in the target environment.

Conclusions

Working with solutions in the Maker Portal has become fun. There is no need to switch to classic mode, at least not for the actions made throughout this blog post. I really like that there is a Solution Checker, and the performance when adding/removing components is really great. Performance for exporting and importing, well, we cannot have it all, can we?

Even though we are able to create a new Dataflow from within a solution (I am so happy to see this), the Dataflow does not end up in the solution (I am sure Microsoft will fix this).

We are now able to add a Dataflow to an existing solution, export the solution and import it into another environment. However, we need to redo the mapping, at least in my experience and for how I tested.

You will need to open the Dataflow and fix the connection to the data source. This probably adds an unmanaged layer to your solution (the same goes for changing the mapping, I assume). I did try the Solution Layers feature but it was quite empty in there. I just assume that you get an unmanaged layer when you open the Dataflow and fix the connection, since you then get to see the Dataflow and the data transformation steps in Power Query – similar to how you get to see a Power Automate flow when you have imported a managed solution with a flow and you fix the connection.

Importing a Dataflow which is configured to Load to New Entity does not run correctly. However – do not use Load to New Entity at all anyway – you will get ugly prefixes.
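If you are unsure which prefix an entity actually ended up with, for example one created by Load to New Entity, you can read its metadata and look at the SchemaName. A minimal sketch, where the logical name is just a placeholder for the entity the Dataflow created:

# Minimal sketch: check the customization prefix of an entity created by a Dataflow.
import requests

ENV_URL = "https://yourorg.crm.dynamics.com"   # placeholder environment URL
API = f"{ENV_URL}/api/data/v9.2"
headers = {
    "Authorization": "Bearer <access token>",  # acquired as in the first sketch
    "OData-MaxVersion": "4.0",
    "OData-Version": "4.0",
    "Accept": "application/json",
}

logical_name = "cr1a2_plant"   # placeholder: the entity the Dataflow created
response = requests.get(
    f"{API}/EntityDefinitions(LogicalName='{logical_name}')?$select=LogicalName,SchemaName",
    headers=headers,
)
response.raise_for_status()
schema_name = response.json()["SchemaName"]
# The part before the underscore is the prefix of the owning publisher - with
# Load to New Entity it is typically not the prefix of your own publisher.
print(schema_name, "->", schema_name.split("_", 1)[0])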

Deleting a managed Solution containing a Dataflow does not actually delete the Dataflow from the environment. But after you have deleted the Solution the Dataflow can be deleted.

In your source environment: if you try to delete a Dataflow from a solution (not just remove it from the solution, but delete it from the entire environment) you will get an error message. You can go to Data – Dataflows and delete it there; it will then be deleted from the environment and, with that, of course also removed from your solution.

It would be interesting to also try out the scheduling part. If the Dataflow is scheduled, will that setting move over to the target environment? I have not given this a try, but I hope it is consistent with Power Automate flows.

By the way – There is a new article about Dataflows which was added to the Microsoft documentation recently and which I really recommend. It actually says in there that it is NOT recommended to Load to new entity. https://docs.microsoft.com/en-us/powerapps/developer/common-data-service/cds-odata-dataflows-migration

I am so happy to see that the ALM story for Power Platform Dataflows finally has begun, and I cannot wait to see what improvements we will get in the (hopefully near) future, hopefully along with new documentation about best practices.

10 thoughts on “ALM for Power Platform Dataflows – The story begins”

  1. Hi Carina, thank you so much for the post, very interesting stuff. May I ask a (probably) stupid question? Is there a clear answer on when to use connectors and when to use dataflows to manipulate data from other sources?


    1. Hi Daniel! There’s no such thing as a stupid question 🙂 and I am happy to hear that you found the post interesting.

      I’ll start with a short answer: use Dataflows for SIMPLE migration, when you want to import a list of records, for SIMPLE one-way integration into the CDS, or when you want to gather data from different sources to prepare for analysis scenarios. The connectors are used in Power Apps and Power Automate. Compared with Dataflows – use a Power Automate flow and a connector for trigger-based scenarios. There are also other ways, e.g. Logic Apps and third-party tools, for integration scenarios.

      And here is a long answer as well:
      Dataflows are for loading/importing data into the Common Data Service (or an Azure Data Lake Gen2). You cannot manipulate the data in the source; it is one-way, into the CDS. You can however work with “Data Transformation” in Power Query, which means that you prepare data and “transform” it before you load it into the CDS. It is not changed in the source, but you can e.g. make sure you match the correct data type in the destination, combine information from different sources, create a new column based on other columns etc. All this “in the middle”, temporarily, just to get it the way you want it and need it in the CDS. If you actually want to manipulate data in the source, then you cannot use a Dataflow, but you could e.g. use Power Automate and utilize a connector which connects to the data source in which you need to manipulate data. If it is a two-way integration you want, then you cannot use Dataflows either; traditionally two popular third-party tools are Scribe and the KingswaySoft SSIS Integration Toolkit. Dataflows can be used as a no-code alternative for loading data into the CDS, manually or on a scheduled basis. It can be manual in a migration scenario, to import a list of records one time, or on a scheduled basis, e.g. when you want to create new records and/or maintain info on records in the CDS from another source and you actually need the records in the CDS. I am not really sure exactly what you mean by when to use connectors, but if you compare with Power Automate and flows (with which you use connectors) I would say you probably use that more for trigger-based scenarios, e.g. something happens in one system and you want to send a notification or manipulate data in another system (or the same – it could e.g. be used for creating logic in your Power Apps).


  2. Great blog Carina, this is really helpful. Thanks for sharing. I guess the issues you had with solutions will be addressed by MS at some point.


    1. Thank you Pavlos! Yes, I assume so. Piece by piece, things get improved all the time. For now I am happy to see the start of ALM for Dataflows, and I will be even happier when it works perfectly.

