|
|||
|
there is a table with the following data:
col1 col2 A B B C Z A Z D M N M P X Y columns col1 and col2 are in parent-child relation. from the above table one can conclude the following three hierarchies: Z-A-B-C M X \ / \ | D N P Y the problem: is there a way to get the following result? col1 col3 A Z/A/B/C/D B Z/A/B/C/D Z Z/A/B/C/D M M/N/P X X/Y or col1 col3 A Z/A/B/C/D B Z/A/B/C/D C Z/A/B/C/D D Z/A/B/C/D Z Z/A/B/C/D M M/N/P N M/N/P P M/N/P X X/Y Y X/Y the order of concatenated values in col3 is generally irrelevant but all values belonging to a hierachy must be present. to put it in another words: how to consolidate values which form a tree but are not connected with the help of any other common value (like in those examples in oracle documentation about concatenating last names of all employees from department 10 using hierarchical queries) I would appreciate any suggestion how to do this. thank you, geos ps. this is crosspost but FollowUp-To is set to oracle.misc |
|
|
||||
|
||||
|
|
|
|||
|
Gerard H. Pille wrote:
> geos wrote: >> >> I would appreciate any suggestion how to do this. >> > > Use a tree walk (hierarchical query) to find the parent, if any, and > another to find all children of that parent. yep, it's so easy to write that much much easier than to implementacceptable solution based on those hierarchical queries and keeping an eye on performance... thank you geos |
|
|||
|
Gerard H. Pille wrote:
> Use a tree walk (hierarchical query) to find the parent, if any, and > another to find all children of that parent. closer inspection of the data reveals that those are not trees or hierarchies but oriented graphs, some without root node. this makes it more complicated. is there any way to identify all nodes which compose a graph for each edge entry in the table? thank you, geos |
|
|||
|
geos wrote:
> Gerard H. Pille wrote: >> Use a tree walk (hierarchical query) to find the parent, if any, and another to find all >> children of that parent. > > closer inspection of the data reveals that those are not trees or hierarchies but oriented > graphs, some without root node. this makes it more complicated. > I quote: "columns col1 and col2 are in parent-child relation. from the above table one can conclude the following three hierarchies" This was wrong, then? In which case, please provide a correct example of your data. |
|
|||
|
geos wrote:
> Gerard H. Pille wrote: >> geos wrote: >>> >>> I would appreciate any suggestion how to do this. >>> >> >> Use a tree walk (hierarchical query) to find the parent, if any, and another to find all >> children of that parent. > > yep, it's so easy to write that much much easier than to implement acceptable solution based> on those hierarchical queries and keeping an eye on performance... > I've been keeping both my eyes on performance in programming for over 35 years now. Of course, I can only do as much as your design allows. Given the correct indexing, tree walks can be performant enough. You wouldn't want me to write the select statement for you, would you? |
|
|||
|
Gerard H. Pille wrote:
> I quote: "columns col1 and col2 are in parent-child relation. from the > above table one can conclude the following three hierarchies" > > This was wrong, then? yes, I corrected my initial post adding the information about graphs instead of hierarchies. > In which case, please provide a correct example > of your data. It wasn't incorrect, it was just incomplete. the solution should do fine with the examples shown in initial post. thank you, geos |
|
|||
|
Gerard H. Pille wrote:
> I've been keeping both my eyes on performance in programming for over 35 > years now. Of course, I can only do as much as your design allows. > Given the correct indexing, tree walks can be performant enough. You > wouldn't want me to write the select statement for you, would you? sure I would. why wouln't ask for more explicit help when struggling with the problem for some time? writing a statement that can be analyzed is great help too, especially a query from someone with such big experience compared to mine. the information about hierarchical queries was obvious from the beginning. anyway, if you would like to help here is a complete script that fills table with sample data and expected result: http://geos2005.republika.pl/table.sql http://geos2005.republika.pl/result.jpg GRP column in the table is just helpful for spotting a groups of records but it can not be used in a query. generally the entries in the table are edges of a graph, can be many nodes in a graph, can be connected in any way between each other allowing loops. thank you, geos |
|
|||
|
Am 21.06.2012 22:51, schrieb geos:
> Gerard H. Pille wrote: >> I've been keeping both my eyes on performance in programming for over 35 >> years now. Of course, I can only do as much as your design allows. Given >> the correct indexing, tree walks can be performant enough. You wouldn't >> want me to write the select statement for you, would you? > > sure I would. Are you then, in turn, going to share part of your pay check with Gerard? ;-) > anyway, if you would like to help here is a complete script that fills table > with sample data and expected result: > > http://geos2005.republika.pl/table.sql That is not a database table. It has no primary key, no constraints, no indexes... It is just a dumped Excel table. Any query against it will always do full table scans. Regards Peter -- Climb the mountain not to plant your flag, but to embrace the challenge, enjoy the air and behold the view. Climb it so you can see the world, not so the world can see you. -- David McCullough Jr. |
|
|||
|
Peter Schneider wrote:
> Are you then, in turn, going to share part of your pay check with > Gerard? ;-) do you think this is the only criteria people help other people? but yes, I would consider that if I knew more details. > That is not a database table. It has no primary key, no constraints, no > indexes... It is just a dumped Excel table. Any query against it will > always do full table scans. it doesn't have to have, because there is no constraints nor primary key. but please do indexes as you like or add whatever you think would help you to approach a problem. this is the data you have. thank you, geos |
|
|||
|
geos wrote:
> Gerard H. Pille wrote: >> I've been keeping both my eyes on performance in programming for over 35 years now. Of course, >> I can only do as much as your design allows. Given the correct indexing, tree walks can be >> performant enough. You wouldn't want me to write the select statement for you, would you? > > sure I would. why wouln't ask for more explicit help when struggling with the problem for some > time? writing a statement that can be analyzed is great help too, especially a query from > someone with such big experience compared to mine. the information about hierarchical queries > was obvious from the beginning. > > anyway, if you would like to help here is a complete script that fills table with sample data > and expected result: > > http://geos2005.republika.pl/table.sql > http://geos2005.republika.pl/result.jpg > > GRP column in the table is just helpful for spotting a groups of records but it can not be used > in a query. generally the entries in the table are edges of a graph, can be many nodes in a > graph, can be connected in any way between each other allowing loops. > > thank you, > geos > I'm not sure I'll be able to help you. I did understand your original requirements, but now you've lost me. I do not know what "edges of a graph" are. What are columns p and c in your table? What do they contain? |
|
|||
|
Am 22.06.2012 02:03, schrieb geos:
> Peter Schneider wrote: >> Are you then, in turn, going to share part of your pay check with Gerard? ;-) > > do you think this is the only criteria people help other people? but yes, I > would consider that if I knew more details. > >> That is not a database table. It has no primary key, no constraints, no >> indexes... It is just a dumped Excel table. Any query against it will always >> do full table scans. > > it doesn't have to have, because there is no constraints nor primary key. but I strongly disagree. Each table in a relational database has to have at least a primary key (and it should better be one with inherent meaning, not just a sequence). Otherwise it is not a relational table, but just a data dump carrying no meaning whatsoever. Please consider: if you don't have a PK, then not even the uniqueness of a complete record is guarenteed, records can just be duplicated as often as coincedence, bad luck or program bugs do like to make it happen. And now think further what happens if you join such a "table" to other table in complex queries, e.g. in aggregation queries computing sums and such. Get the point? You should better rethink you design, or you will run into big data consistency problems later. Trust me, I've seen such shit happen with bad design. And if you have a tree model, you should be able to answer these question: - can an entity instance which occurs as a child record in the tree be uniquely identified only because of its position in the tree hierarchy or is in some way uniquey by itself? - is it allowed to have multiple instances of the same child element to appear at the same level (as children of the same parent) or is this disallowed? - is it allowed to have multiple roots in the tree, or is only one root allowed? All these findings need to be enforced by constraints, otherwise your tree data is going to violate the rules at some point in time or the other. If you haven't thought yet about such kind of questions, my guess is that you probably haven't thought deep enough about the entities in real life that your DB is trying to model. Regards Peter -- Climb the mountain not to plant your flag, but to embrace the challenge, enjoy the air and behold the view. Climb it so you can see the world, not so the world can see you. -- David McCullough Jr. |
|
|||
|
Gerard H. Pille wrote:
> I'm not sure I'll be able to help you. I did understand your original > requirements, but now you've lost me. I do not know what "edges of a > graph" are. What are columns p and c in your table? What do they contain? what is not clear to you? I thought everything was said and shown. write down a few dots. connect them with arrows any way you like. that's a graph. an arrow corresponds to edge. an arrow introduces orientation aka parent-child relation. assign a letter to every dot. now you can describe the edge following the arrow, like FROM a TO b. the table contain edges, ie. entries FROM a TO b, FROM 7 TO 11 etc. the task is to identify all graphs and name them. the naming convention is that you pick any "dot name" composing the graph and assign it to graph. you can identify a graph only by identifying all edges belonging to it. a grouping column was introduced for debugging purposes only. it helps you to easily spot edges which form a graph so that you can compare the output of your algorithm at a glance. thank you, geos |
|
|||
|
Peter Schneider wrote:
> I strongly disagree. Each table in a relational database has to have at > least a primary key [...] > [...]If you haven't thought yet about such kind of questions, my guess is > that you probably haven't thought deep enough about the entities in real > life that your DB is trying to model. you said many interesting things of which no one makes it closer to solve the problem. and it's not the point of my post either. imagine there is a table, that you're not an owner, that you're not designer, that you don't even intend to model anything etc. you're just a guy who likes solving puzzles. you were given this table. it has some structure. it's a good structure, it has some business meaning for someone (remember: you're not designer). and imagine that there is a quite a prize for just solving the puzzle. not for asking questions and bringing up other interesting, but unrelated subjects (they give minus points for this and call IRS). so, can you solve the problem with pure SQL? if there is anything that needs to be explained about the problem please ask, I will explain in more datails. but I think that the data set I presented along with the expected results are self-explanatory. thank you, geos |
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|