|
|||
|
Hi, I am migrating from PHP to Python and I am slightly confused.
I am making a function that takes a startingList, finds all the strings in the list that begin with 'x', removes those strings and puts them into a xOnlyList. However if you run the code you will notice only one of the strings beginning with 'x' is removed from the startingList. If I comment out 'startingList.remove(str);' the code runs with both strings beginning with 'x' being put in the xOnlyList. Using the print statement I noticed that the second string that begins with 'x' isn't even identified by the function. Why does this happen? def testFunc(startingList): xOnlyList = []; for str in startingList: if (str[0] == 'x'): print str; xOnlyList.append(str) startingList.remove(str) #this seems to be the problem print xOnlyList; print startingList testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews']) #Thanks for your help! |
|
|
||||
|
||||
|
|
|
|||
|
light1quark@gmail.com writes:
> However if you run the code you will notice only one of the strings > beginning with 'x' is removed from the startingList. > > def testFunc(startingList): > xOnlyList = []; > for str in startingList: > if (str[0] == 'x'): > print str; > xOnlyList.append(str) > startingList.remove(str) #this seems to be the problem > print xOnlyList; > print startingList > testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews']) > > #Thanks for your help! Try with ['xasd', 'sefwr', 'xjkl', 'dfsews'] and you'll understand what happens. Also, have a look at: http://docs.python.org/reference/com...-for-statement You can't modify the list you're iterating on, better use another list to collect the result. -- Alain. P/S: str is a builtin, you'd better avoid assigning to it. |
|
|||
|
On 8/14/2012 11:59 AM, Alain Ketterlin wrote:
> light1quark@gmail.com writes: > >> However if you run the code you will notice only one of the strings >> beginning with 'x' is removed from the startingList. > >> >> def testFunc(startingList): >> xOnlyList = []; >> for str in startingList: >> if (str[0] == 'x'): >> print str; >> xOnlyList.append(str) >> startingList.remove(str) #this seems to be the problem >> print xOnlyList; >> print startingList >> testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews']) >> >> #Thanks for your help! > > Try with ['xasd', 'sefwr', 'xjkl', 'dfsews'] and you'll understand what > happens. Also, have a look at: > > http://docs.python.org/reference/com...-for-statement > > You can't modify the list you're iterating on, Except he obviously did ;-). (Modifying set or dict raises SomeError.) Indeed, people routine *replace* items while iterating. def squarelist(lis): for i, n in enumerate(lis): lis[i] = n*n return lis print(squarelist([0,1,2,3,4,5])) # [0, 1, 4, 9, 16, 25] Removals can be handled by iterating in reverse. This works even with duplicates because if the item removed is not the one tested, the one tested gets retested. def removeodd(lis): for n in reversed(lis): if n % 2: lis.remove(n) print(n, lis) ll = [0,1, 5, 5, 4, 5] removeodd(ll) >>> 5 [0, 1, 5, 4, 5] 5 [0, 1, 4, 5] 5 [0, 1, 4] 4 [0, 1, 4] 1 [0, 4] 0 [0, 4] > better use another list to collect the result. If there are very many removals, a new list will be faster, even if one needs to copy the new list back into the original, as k removals from len n list is O(k*n) versus O(n) for new list and copy. > P/S: str is a builtin, you'd better avoid assigning to it. Agreed. People have actually posted code doing something like .... list = [1,2,3] .... z = list(x) .... and wondered and asked why it does not work. -- Terry Jan Reedy |
|
|||
|
I got my answer by reading your posts and referring to: http://docs.python.org/reference/com...-for-statement
(particularly the shaded grey box) I guess I should have (obviously) looked at the doc's before posting here; but im a noob. Thanks for your help. |
|
|||
|
On 2012-08-14 17:38, light1quark@gmail.com wrote:
> Hi, I am migrating from PHP to Python and I am slightly confused. > > I am making a function that takes a startingList, finds all the strings in the list that begin with 'x', removes those strings and puts them into a xOnlyList. > > However if you run the code you will notice only one of the strings beginning with 'x' is removed from the startingList. > If I comment out 'startingList.remove(str);' the code runs with both strings beginning with 'x' being put in the xOnlyList. > Using the print statement I noticed that the second string that begins with 'x' isn't even identified by the function. Why does this happen? > > def testFunc(startingList): > xOnlyList = []; > for str in startingList: > if (str[0] == 'x'): > print str; > xOnlyList.append(str) > startingList.remove(str) #this seems to be the problem > print xOnlyList; > print startingList > testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews']) > > #Thanks for your help! You might find the following useful: def testFunc(startingList): xOnlyList = []; j = -1 for xl in startingList: if (xl[0] == 'x'): xOnlyList.append(xl) else: j += 1 startingList[j] = xl if j == -1: startingList = [] else: del startingList[j:-1] return(xOnlyList) testList1 = ['xasd', 'xjkl', 'sefwr', 'dfsews'] testList2 = ['xasd', 'xjkl', 'xsefwr', 'xdfsews'] testList3 = ['xasd', 'jkl', 'sefwr', 'dfsews'] testList4 = ['asd', 'jkl', 'sefwr', 'dfsews'] xOnlyList = testFunc(testList1) print 'xOnlyList = ',xOnlyList print 'testList = ',testList1 xOnlyList = testFunc(testList2) print 'xOnlyList = ',xOnlyList print 'testList = ',testList2 xOnlyList = testFunc(testList3) print 'xOnlyList = ',xOnlyList print 'testList = ',testList3 xOnlyList = testFunc(testList4) print 'xOnlyList = ',xOnlyList print 'testList = ',testList4 And here is another version using list comprehension that I prefer testList1 = ['xasd', 'xjkl', 'sefwr', 'dfsews'] testList2 = ['xasd', 'xjkl', 'xsefwr', 'xdfsews'] testList3 = ['xasd', 'jkl', 'sefwr', 'dfsews'] testList4 = ['asd', 'jkl', 'sefwr', 'dfsews'] def testFunc2(startingList): return([x for x in startingList if x[0] == 'x'], [x for x in startingList if x[0] != 'x']) xOnlyList,testList = testFunc2(testList1) print xOnlyList print testList xOnlyList,testList = testFunc2(testList2) print xOnlyList print testList xOnlyList,testList = testFunc2(testList3) print xOnlyList print testList xOnlyList,testList = testFunc2(testList4) print xOnlyList print testList |
|
|||
|
On Wed, Aug 15, 2012 at 1:38 AM, <light1quark@gmail.com> wrote:
> def testFunc(startingList): > xOnlyList = []; > for str in startingList: > if (str[0] == 'x'): > print str; > xOnlyList.append(str) > startingList.remove(str) #this seems to be the problem > print xOnlyList; > print startingList > testFunc(['xasd', 'xjkl', 'sefwr', 'dfsews']) Other people have explained the problem with your code. I'll take this example as a way of introducing you to one of Python's handy features - it's an idea borrowed from functional languages, and is extremely handy. It's called the "list comprehension", and can be looked up in the docs under that name, def testFunc(startingList): xOnlyList = [strng for strng in startingList if strng[0] == 'x'] startingList = [strng for strng in startingList if strng[0] != 'x'] print(xOnlyList) print(startingList) It's a compact notation for building a list from another list. (Note that I changed "str" to "strng" to avoid shadowing the built-in name "str", as others suggested.) (Unrelated side point: Putting parentheses around the print statements makes them compatible with Python 3, in which 'print' is a function. Unless something's binding you to Python 2, consider working with the current version - Python 2 won't get any more features added to it any more.) Python's an awesome language. You may have to get your head around a few new concepts as you shift thinking from PHP's, but it's well worth while. Chris Angelico |
|
|||
|
On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote:
> You might find the following useful: > > def testFunc(startingList): > xOnlyList = []; j = -1 > for xl in startingList: > if (xl[0] == 'x'): That's going to fail in the starting list contains an empty string. Use xl.startswith('x') instead. > xOnlyList.append(xl) > else: > j += 1 > startingList[j] = xl Very cunning, but I have to say that your algorithm fails the "is this obviously correct without needing to study it?" test. Sometimes that is unavoidable, but for something like this, there are simpler ways to solve the same problem. > if j == -1: > startingList = [] > else: > del startingList[j:-1] > return(xOnlyList) > And here is another version using list comprehension that I prefer > def testFunc2(startingList): > return([x for x in startingList if x[0] == 'x'], [x for x in > startingList if x[0] != 'x']) This walks over the starting list twice, doing essentially the same thing both times. It also fails to meet the stated requirement that startingList is modified in place, by returning a new list instead. Here's an example of what I mean: py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx'] # two names for one list py> result, mylist = testFunc2(mylist) py> mylist ['a', 'b', 'cx'] py> mylist2 # should be same as mylist ['a', 'x', 'b', 'xx', 'cx'] Here is the obvious algorithm for extracting and removing words starting with 'x'. It walks the starting list only once, and modifies it in place. The only trick needed is list slice assignment at the end. def extract_x_words(words): words_with_x = [] words_without_x = [] for word in words: if word.startswith('x'): words_with_x.append(word) else: words_without_x.append(word) words[:] = words_without_x # slice assignment return words_with_x The only downside of this is that if the list of words is so enormous that you can fit it in memory *once* but not *twice*, this may fail. But the same applies to the list comprehension solution. -- Steven |
|
|||
|
Chris Angelico <rosuav@gmail.com> writes:
> Other people have explained the problem with your code. I'll take this > example as a way of introducing you to one of Python's handy features > - it's an idea borrowed from functional languages, and is extremely > handy. It's called the "list comprehension", and can be looked up in > the docs under that name, > > def testFunc(startingList): > xOnlyList = [strng for strng in startingList if strng[0] == 'x'] > startingList = [strng for strng in startingList if strng[0] != 'x'] > print(xOnlyList) > print(startingList) > > It's a compact notation for building a list from another list. (Note > that I changed "str" to "strng" to avoid shadowing the built-in name > "str", as others suggested.) Fully agree with you: list comprehension is, imo, the most useful program construct ever. Extremely useful. But not when it makes the program traverse twice the same list, where one traversal is enough. -- Alain. |
|
|||
|
light1quark@gmail.com writes:
> I got my answer by reading your posts and referring to: > http://docs.python.org/reference/com...-for-statement > (particularly the shaded grey box) Not that the problem is not specific to python (if you erase the current element when traversing a STL list in C++ you'll get a crash as well). > I guess I should have (obviously) looked at the doc's before posting > here; but im a noob. Python has several surprising features. I think it is a good idea to take some time to read the language reference, from cover to cover (before or after the various tutorials, depending on your background). -- Alain. |
|
|||
|
On 15-Aug-2012 02:19, Steven D'Aprano wrote:
> On Tue, 14 Aug 2012 21:40:10 +0200, Virgil Stokes wrote: > >> You might find the following useful: >> >> def testFunc(startingList): >> xOnlyList = []; j = -1 >> for xl in startingList: >> if (xl[0] == 'x'): > That's going to fail in the starting list contains an empty string. Use > xl.startswith('x') instead. Yes, but this was by design (tacitly assumed that startingList was both a list and non-empty). > > >> xOnlyList.append(xl) >> else: >> j += 1 >> startingList[j] = xl > Very cunning, but I have to say that your algorithm fails the "is this > obviously correct without needing to study it?" test. Sometimes that is > unavoidable, but for something like this, there are simpler ways to solve > the same problem. Sorry, but I do not sure what you mean here. > > >> if j == -1: >> startingList = [] >> else: >> del startingList[j:-1] >> return(xOnlyList) > >> And here is another version using list comprehension that I prefer >> def testFunc2(startingList): >> return([x for x in startingList if x[0] == 'x'], [x for x in >> startingList if x[0] != 'x']) > This walks over the starting list twice, doing essentially the same thing > both times. It also fails to meet the stated requirement that > startingList is modified in place, by returning a new list instead. This can meet the requirement that startingList is modified in place via the call to this function (see the attached code). > Here's an example of what I mean: > > py> mylist = mylist2 = ['a', 'x', 'b', 'xx', 'cx'] # two names for one > list > py> result, mylist = testFunc2(mylist) > py> mylist > ['a', 'b', 'cx'] > py> mylist2 # should be same as mylist > ['a', 'x', 'b', 'xx', 'cx'] Yes, I had a typo in my original posting --- sorry about that! > > Here is the obvious algorithm for extracting and removing words starting > with 'x'. It walks the starting list only once, and modifies it in place. > The only trick needed is list slice assignment at the end. > > def extract_x_words(words): > words_with_x = [] > words_without_x = [] > for word in words: > if word.startswith('x'): > words_with_x.append(word) > else: > words_without_x.append(word) > words[:] = words_without_x # slice assignment > return words_with_x Suppose words was not a list --- you have tacitly assumed that words is a list. > > The only downside of this is that if the list of words is so enormous > that you can fit it in memory *once* but not *twice*, this may fail. But > the same applies to the list comprehension solution. But, this is not the only downside if speed is important --- it is slower than the list comprehension method (see results that follows). Here is a summary of three algorithms (algorithm-1, algorithm-2, algorithm-2A) that I tested (see attached code). Note, algorithm-2A was obtained by removing the slice assignment in the above code and modifying the return as follows def extract_x_words(words): words_with_x = [] words_without_x = [] for word in words: if word.startswith('x'): words_with_x.append(word) else: words_without_x.append(word) #words[:] = words_without_x # slice assignment return words_with_x, words_without_x Of course, one needs to modify the call for "in-place" update of startingList as follows: xOnlyList,startingList = extract_x_words(startingList) Here is a summary of my timing results obtained for 3 different algorithms for lists with 100,000 strings of length 4 in each list: Method average (sd) time in seconds algorithm-1 (list comprehension) 0.11630 (0.0014) algorithm-2 (S. D'Aprano) 0.17594 (0.0014) algorithm-2A (modified S. D'Aprano) 0.18217 (0.0023) These values were obtained from 100 independent runs (MC simulations) on lists that contain 100,000 strings. Approximately 50% of these strings contained a leading 'x'. Note, that the results show that algorithm-2 (suggested by S. D'Aprano) is approximately 51% slower than algorithm-1 (list comprehensions) and algorithm-2A (simple modification of algorithm-2) is approximately 57% slower than algorithm-1. Why is algorithm-2A slower than algorithm-2? I would be interested in seeing code that is faster than algorithm-1 --- any suggestions are welcomed. And of course, if there are any errors in my attached code please inform me of them and I will try to correct them as soon as possible. Note, some of the code is actually irrelevant for the original "Strange behavior" post. Have a good day! |
|
|||
|
Virgil Stokes wrote:
>>> def testFunc(startingList): >>>xOnlyList = []; j = -1 >>>for xl in startingList: >>>if (xl[0] == 'x'): >> That's going to fail in the starting list contains an empty string. Use >> xl.startswith('x') instead. > Yes, but this was by design (tacitly assumed that startingList was both a > list and non-empty). You missunderstood it will fail if the list contains an empty string, not if the list itself is empty: >>> words = ["alpha", "", "xgamma"] >>> [word for word in words if word[0] == "x"] Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: string index out of range The startswith() version: >>> [word for word in words if word.startswith("x")] ['xgamma'] Also possible: >>> [word for word in words if word[:1] == "x"] ['xgamma'] > def testFunc1(startingList): > ''' > Algorithm-1 > Note: > One should check for an empty startingList before > calling testFunc1 -- If this possibility exists! > ''' > return([x for x in startingList if x[0] == 'x'], > [x for x in startingList if x[0] != 'x']) > > > I would be interested in seeing code that is faster than algorithm-1 In pure Python? Perhaps the messy variant: def test_func(words): nox = [] append = nox.append withx = [x for x in words if x[0] == 'x' or append(x)] return withx, nox |
|
|
![]() |
| Thread Tools | |
| Display Modes | |
|
|