Invelos Forums->DVD Profiler: Contribution Discussion |
Parsing |
Author |
Message |
Registered: March 29, 2007 | Reputation: | Posts: 4,479 |
| Posted: | | | | Quoting m.cellophane: Quote:
If the program could ignore parsing differences, I think we'd be on a good path to eliminating many of the issues.
Unfortunately, linking problems do not come only from parsing. As I already wrote in another thread, linking problems are due to several factors. Judging from my collection and the problems I had to solve, the reasons are:
1/ different credits for the same actor, with a contributor not using the Common name from the CLT
2/ bad transcription of capitalized letters, omitting accents
3/ typos by contributors when copying credits
4/ Asian names that are sometimes in Asian order, sometimes in Western order (Gong Li/Li Gong, Zhang Ziyi/Ziyi Zhang)
5/ different actors with the same name, without birth year
Those reasons account for more than 80% of linking problems. Ignoring parsing will solve nothing for those.
6/ evidently incorrect parsing by contributors ignoring the rules on titles and articles... About 15% of linking problems, which could probably be solved by automatic filters
7/ difficult parsing: about 5%
I'm not against an "ignore parsing" solution. But in fact it will solve only a very small percentage of linking problems. | | | Images from movies |
| Registered: March 13, 2007 | Reputation: | Posts: 6,635 |
| Posted: | | | | Quoting surfeur51: Quote: I'm not against an "ignore parsing" solution. But in fact it will solve only a very small percentage of linking problems. More importantly, it would eliminate these endless, and useless, pages and pages of debates on parsing! .......or maybe not! | | | Hal |
| Registered: March 13, 2007 | Reputation: | Posts: 2,293 |
| Posted: | | | | Quoting lyonsden5: Quote: The only thing that would (IMO) is an amendment to the rules to give direction. True, but I thought we had that at one point, or at least a widely followed agreement (though vociferously objected to by some, of course!), that we put first name in first field, last name in last field and anything else in the middle field, and we totally ignored whether the person in question thought of (say) their last two names as a surname or not. However, the idea that the program could automatically ignore parsing for linking purposes would solve a huge portion of the problem and I think it's an excellent idea. | | | It is dangerous to be right in matters where established men are wrong |
| Registered: March 13, 2007 | Reputation: | Posts: 1,774 |
| Posted: | | | | Quoting hal9g: Quote: More importantly, it would eliminate these endless, and useless, pages and pages of debates on parsing!
.......or maybe not! Don't worry, we have a lot of other problems that could be "discussed" ( = could be used for flame wars): Asian names and Japanese romanization, CLT results and how to interpret them, copy & paste of cast/crew from a different profile, aspect ratio (actual vs. rounded), audio tracks that are available but not selectable via menu... OK, on topic: I don't understand why we need a "standard" for parsing. Whether you start with 1/2/3 or 1/2 3 and then document a change to the opposite doesn't make a difference. Unless the data is somehow verified, it could be wrong. So there is no use in saying 1/2/3 has to be the standard or 1/2 3 has to be the standard, because both could be wrong without further verification. The database won't be any more correct by using 1/2/3 as a start than it would be by using 1/2 3 as a start. IMHO this whole discussion is much ado about nothing. So why lose so much time on endless debates? This time could be used to actually verify or change the parsing of a few actors, where it is necessary. | | | Last edited: by SpaceFreakMicha |
| Registered: March 13, 2007 | Posts: 2,759 |
| Posted: | | | | Quoting Voltaire53: Quote:
True, but I thought we had that at one point, or at least a widely followed agreement (though vociferously objected to by some, of course!), that we put first name in first field, last name in last field and anything else in the middle field, and we totally ignored whether the person in question thought of (say) their last two names as a surname or not. No, this word counting was developed in an external forum by an unofficial, self-appointed rules committee; it never made it into the official Invelos rules, nor has it ever been a general consensus in this forum. |
| Registered: March 15, 2007 | Posts: 374 |
| Posted: | | | | If there were another, intelligent mechanism to link cast & crew (and some good ideas have been brought forward here in the forum), then parsing and the way you write a person's name wouldn't matter.
I think this would be a big improvement. And the end of a lot of (partly unfruitful) discussions. |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting Voltaire53: Quote: Quoting lyonsden5:
Quote: The only thing that would (IMO) is an amendment to the rules to give direction.
True, but I thought we had that at one point, or at least a widely followed agreement (though vociferously objected to by some, of course!), that we put first name in first field, last name in last field and anything else in the middle field, and we totally ignored whether the person in question thought of (say) their last two names as a surname or not.
However, the idea that the program could automatically ignore parsing for linking purposes would solve a huge portion of the problem and I think it's an excellent idea. Quite right, Voltaire, I think that most of us have been following this. But unfortunately most isn't good enough; all it takes is ONE user (Rho) who decides he doesn't want to do it, but wants to do it his way instead, to make a complete hash out of everything. Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: March 13, 2007 | Reputation: | Posts: 1,774 |
| Posted: | | | | Quoting Dr Pavlov: Quote: Quoting Voltaire53:
Quote: Quoting lyonsden5:
Quote: The only thing that would (IMO) is an amendment to the rules to give direction.
True, but I thought we had that at one point, or at least a widely followed agreement (though vociferously objected to by some, of course!), that we put first name in first field, last name in last field and anything else in the middle field, and we totally ignored whether the person in question thought of (say) their last two names as a surname or not.
However, the idea that the program could automatically ignore parsing for linking purposes would solve a huge portion of the problem and I think it's an excellent idea. Quite right, Voltaire, I think that most of us have been following this. But unfortunately most isn't good enough; all it takes is ONE user (Rho) who decides he doesn't want to do it, but wants to do it his way instead, to make a complete hash out of everything.
Skip Could you please provide us with a link for this "agreement"? | | | Last edited: by SpaceFreakMicha |
| Registered: March 13, 2007 | Reputation: | Posts: 13,202 |
| Posted: | | | | Quoting SpaceFreakMicha: Quote: OK, on topic: I don't understand why we need a "standard" for parsing. Whether you start with 1/2/3 or 1/2 3 and then document a change to the opposite doesn't make a difference. Unless the data is somehow verified, it could be wrong. So there is no use in saying 1/2/3 has to be the standard or 1/2 3 has to be the standard, because both could be wrong without further verification.
The database won't be any more correct by using 1/2/3 as a start than it would be by using 1/2 3 as a start.
IMHO this whole discussion is much ado about nothing. So why lose so much time on endless debates? This time could be used to actually verify or change the parsing of a few actors, where it is necessary. The reason it is an issue is linking. Let me try and explain using Robin Wright Penn. Let's say you enter her in a profile as Robin/ /Wright Penn and Skip enters her in a different profile as Robin/Wright/Penn. When I download those two profiles, I will have two different actor entries. If I double click on one, the profile...or profiles...with the other will not come up. For them to link, I would have to change one of the names. For a lot of people, this is unacceptable. | | | No dictator, no invader can hold an imprisoned population by force of arms forever. There is no greater power in the universe than the need for freedom. Against this power, governments and tyrants and armies cannot stand. The Centauri learned this lesson once. We will teach it to them again. Though it take a thousand years, we will be free. - Citizen G'Kar |
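To make the Robin Wright Penn example concrete, here is a minimal sketch of what "ignoring parsing for linking" could look like. It is not how DVD Profiler works internally; the `link_key` function and the first/middle/last tuple layout are assumptions made purely for illustration.

```python
def link_key(first: str, middle: str, last: str) -> str:
    """Hypothetical linking key that ignores how a name was split
    across the first/middle/last fields."""
    # Join the three fields, then collapse whitespace so that empty or
    # blank fields leave no trace in the key.
    return " ".join(f"{first} {middle} {last}".split())


# Two contributors parse the same credit differently.
entry_a = ("Robin", "", "Wright Penn")   # Robin/ /Wright Penn
entry_b = ("Robin", "Wright", "Penn")    # Robin/Wright/Penn

# Comparing the raw fields treats them as two different people...
fields_match = entry_a == entry_b
# ...while comparing the keys treats them as the same person.
keys_match = link_key(*entry_a) == link_key(*entry_b)
print(fields_match, keys_match)
```

A key like this would only address differences in parsing. It would not help with different name order (Gong Li / Li Gong), typos, or same-named actors without a birth year, which surfeur51 listed earlier in the thread.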
| Registered: March 13, 2007 | Reputation: | Posts: 1,774 |
| Posted: | | | | Quoting TheMadMartian: Quote: The reason it is an issue is linking. Let me try and explain using Robin Wright Penn.
Let's say you enter her in a profile as Robin/ /Wright Penn and Skip enters her in a different profile as Robin/Wright/Penn. When I download those two profiles, I will have two different actor entries. If I double click on one, the profile...or profiles...with the other will not come up. For them to link, I would have to change one of the names. For a lot of people, this is unacceptable. I see the problem, but I can't see why 1/2/3 is any better as a starting point than 1/2 3. |
| Registered: January 1, 2009 | Reputation: | Posts: 3,087 |
| Posted: | | | | Quoting eaglejd: Quote: .... How is this parsed?
List/ of/ Accepted/ Parsed/ Names/ with/ Documentation?
List of Accepted Parsed//Names with Documentation?
List of Accepted/ Parsed Names/ with Documentation?
Serious? Of course it's a stage name so it is List of Accepted Parsed Names with Documentation// |
| Registered: March 13, 2007 | Reputation: | Posts: 13,202 |
| Posted: | | | | Quoting SpaceFreakMicha: Quote: I see the problem, but I can't see why 1/2/3 is any better as a starting point than 1/2 3. I don't know that it is any better; neutral, but not better. As I said in another thread, it really doesn't matter to me as I don't care about linking, but I would like it to work for those that do, and a set starting point is the only solution that I can think of. | | | No dictator, no invader can hold an imprisoned population by force of arms forever. There is no greater power in the universe than the need for freedom. Against this power, governments and tyrants and armies cannot stand. The Centauri learned this lesson once. We will teach it to them again. Though it take a thousand years, we will be free. - Citizen G'Kar |
| Registered: March 13, 2007 | Posts: 21,610 |
| Posted: | | | | Quoting SpaceFreakMicha: Quote: Quoting TheMadMartian:
Quote: The reason it is an issue is linking. Let me try and explain using Robin Wright Penn.
Let's say you enter her in a profile, as Robin/ /Wright Penn and Skip enters her in a different profile as Robin/Wright/Penn. When I download those two profiles, I will have two different actor entries. If I double click on one, the profile...or profiles...with the other will not come up. For them to link, I would have to change one of the names. For a lot of people, this is unacceptable.
I see the problem, but I can't see why 1/2/3 is any better as a starting point than 1/2 3. I have explained this many times, Space. Skip | | | ASSUME NOTHING!!!!!! CBE, MBE, MoA and proud of it. Outta here
Billy Video |
| Registered: March 13, 2007 | Posts: 1,414 |
| Posted: | | | | The linking should ignore capitalization too, just like the contribution system and the CLT do. | | | "This movie has warped my fragile little mind." |
| Registered: March 13, 2007 | Reputation: | Posts: 6,635 |
| Posted: | | | | Quoting gardibolt: Quote: The linking should ignore capitalization too, just like the contribution system and the CLT do. Good point, except locally, I think you can only have one version of capitalization, and linking happens in your local db. | | | Hal | | | Last edited: by hal9g |
| Registered: March 13, 2007 | Reputation: | Posts: 13,202 |
| Posted: | | | | Quoting gardibolt: Quote: The linking should ignore capitalization too, just like the contribution system and the CLT do. It already does, kinda. You can't have both Danny DeVito and Danny Devito in your local db. If you have DeVito, and download a profile with Devito, the local program will use DeVito. Parsing, in my opinion, should be done exactly the same way...though I don't know how difficult that would be to program. | | | No dictator, no invader can hold an imprisoned population by force of arms forever. There is no greater power in the universe than the need for freedom. Against this power, governments and tyrants and armies cannot stand. The Centauri learned this lesson once. We will teach it to them again. Though it take a thousand years, we will be free. - Citizen G'Kar |
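The DeVito/Devito behaviour described above, extended to parsing, might be sketched roughly as follows. This is only an illustration under stated assumptions: the `LocalNames` class and its `resolve` method are hypothetical and do not reflect the program's actual code. The spelling already stored locally wins, and incoming names are matched without regard to capitalization or to how they were split across fields.

```python
class LocalNames:
    """Hypothetical local name table: one stored spelling per person."""

    def __init__(self):
        # Maps a normalized key to the (first, middle, last) spelling
        # that was stored first.
        self._by_key = {}

    @staticmethod
    def _key(first, middle, last):
        # Ignore field boundaries (join and collapse whitespace) and
        # ignore capitalization (casefold).
        return " ".join(f"{first} {middle} {last}".split()).casefold()

    def resolve(self, first, middle, last):
        """Return the spelling already stored locally, or store this one."""
        key = self._key(first, middle, last)
        return self._by_key.setdefault(key, (first, middle, last))


names = LocalNames()
names.resolve("Danny", "", "DeVito")            # already in the local db
print(names.resolve("Danny", "", "Devito"))     # downloaded profile, other case
names.resolve("Robin", "Wright", "Penn")        # already in the local db
print(names.resolve("Robin", "", "Wright Penn"))  # downloaded, other parsing
```

In this sketch, whichever version reaches the local database first becomes the one that is kept, so no "standard" starting point is needed for the entries to link.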