Newsgroup: comp.lang.python

  #1 (permalink)  
Old 08-07-2012, 01:18 PM
Roy Smith
Default I thought I understood how import worked...

I've been tracking down some weird import problems we've been having with
django. Our settings.py file is getting imported twice. It has some
non-idempotent code in it, and we blow up on the second import.

I thought modules could not get imported twice. The first time they get
imported, they're cached, and the second import just gets you a reference to the
original. Playing around, however, I see that it's possible to import a module
twice if you refer to it by different names. Here's a small-ish test case which
demonstrates what I'm talking about (python 2.6.5):

In directory /home/roy/play/import/foo, I've got:

__init__.py (empty file)
try.py
broken.py


$ cat broken.py
print __file__


$ cat try.py
import broken
import foo.broken

import sys
for m in sys.modules.items():
    if m[0].endswith('broken'):
        print m


And when I run try.py (with foo as the current directory):

$ PYTHONPATH=/home/roy/play/import python try.py
/home/roy/play/import/foo/broken.pyc
/home/roy/play/import/foo/broken.pyc
('broken', <module 'broken' from '/home/roy/play/import/foo/broken.pyc'>)
('foo.broken', <module 'foo.broken' from '/home/roy/play/import/foo/broken.pyc'>)


So, it appears that you *can* import a module twice, if you refer to it by
different names! This is surprising. It means that having non-idempotent code
which is executed at import time is a Bad Thing.

It also means that you could have multiple copies of a module's global
namespace, depending on how your users imported the module. Which is kind of
mind-blowing.
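
For anyone who wants to see the same effect on Python 3, here is a minimal sketch. It is not the original test case: it assumes a throwaway copy of the foo/broken.py layout built in a temporary directory, and it puts both directories on sys.path by hand instead of relying on PYTHONPATH and the current directory.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: <root>/foo/__init__.py and <root>/foo/broken.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "foo")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "broken.py"), "w") as f:
    f.write("print('executing', __name__)\n")

# Both the package directory and its parent are importable locations,
# which mirrors running from inside foo with PYTHONPATH set to the parent.
sys.path[:0] = [pkg_dir, root]
importlib.invalidate_caches()

import broken        # found via pkg_dir, cached under the name 'broken'
import foo.broken    # found via root, cached under the name 'foo.broken'

# Two cache entries, two module objects, one file on disk.
print(broken is foo.broken)   # False
print(os.path.samefile(broken.__file__, foo.broken.__file__))
```

The top-level print in broken.py runs once per name, so it shows up twice.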
  #2 (permalink)  
Old 08-07-2012, 01:52 PM
Steven D'Aprano
Default Re: I thought I understood how import worked...

On Tue, 07 Aug 2012 09:18:26 -0400, Roy Smith wrote:

> I thought modules could not get imported twice. The first time they get
> imported, they're cached, and the second import just gets you a
> reference to the original. Playing around, however, I see that it's
> possible to import a module twice if you refer to it by different names.


Yes. You've found a Python gotcha.

The most common example of this is when a single file acts as both an
importable module, and as a runnable script. When run as a script, it is
known as "__main__". When imported, it is known by the file name. Unless
you take care, it is easy to end up with the module imported twice.

The usual advice is "never have one module used as both script and
importable module". I think *never* is a bit strong, but if you do so,
you need to take extra care.
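
A rough sketch of the script-versus-module case on Python 3, in case it helps. The file name dual.py is made up for the illustration; the outer script writes it to a temporary directory and runs it in a child interpreter so that it really is "__main__" there.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical script that imports itself by its file name.
source = textwrap.dedent("""
    print("top-level code running as", __name__)
    if __name__ == "__main__":
        import dual    # same file, but cached under a different name
        import sys
        print(sys.modules["__main__"] is sys.modules["dual"])
""")

workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "dual.py")
with open(script, "w") as f:
    f.write(source)

# Running the file as a script puts its directory first on sys.path,
# so the inner "import dual" finds the very same file again.
result = subprocess.run(
    [sys.executable, script],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```

The top-level print appears twice in the child's output, once as __main__ and once as dual, and the identity check prints False.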

> Here's a small-ish test case which demonstrates what I'm talking about
> (python 2.6.5):
>
> In directory /home/roy/play/import/foo, I've got:
>
> __init__.py (empty file)
> try.py
> broken.py


Aside: calling a module "try.py" is asking for trouble, because you can't
do this:

import try


> $ cat broken.py
> print __file__
>
>
> $ cat try.py
> import broken
> import foo.broken


Which are two names for the same module.


> So, it appears that you *can* import a module twice, if you refer to it
> by different names! This is surprising. It means that having
> non-idempotent code which is executed at import time is a Bad Thing.


Well yes. In general, you should avoid non-idempotent code. You should
doubly avoid it during imports, and triply avoid it on days ending with Y.

The rest of the time, it is perfectly safe to have non-idempotent code.



I kid, of course, but only half. Side-effects are bad, non-idempotent
code is bad, and you should avoid them as much as possible, unless
you have no other reasonable choice.


> It also means that you could have multiple copies of a module's global
> namespace, depending on how your users imported the module. Which is
> kind of mind-blowing.


Oh that part is trivial. Module namespaces are just dicts, there's
nothing special about them.

py> import math # for example
py> import copy
py> namespace = copy.deepcopy(math.__dict__)
py> math.__dict__ == namespace
True
py> math.__dict__ is namespace
False


It is modules that should be special, and Python tries really hard to
ensure that they are singletons. (Multitons?) But not superhumanly hard.


--
Steven
  #3 (permalink)  
Old 08-07-2012, 01:55 PM
Ben Finney
Default Re: I thought I understood how import worked...

Roy Smith <roy@panix.com> writes:

> So, it appears that you *can* import a module twice, if you refer to
> it by different names! This is surprising.


The tutorial is misleading on this. It says plainly:

A module can contain executable statements as well as function
definitions. […] They are executed only the *first* time the module
is imported somewhere.

<URL:http://docs.python.org/tutorial/modules.html>

but it doesn't make clear that a module can exist in the ‘sys.modules’
list multiple times under different names.

Care to file a documentation bug <URL:http://bugs.python.org/>
describing this?

> It means that having non-idempotent code which is executed at import
> time is a Bad Thing.


This is true whether or not the above about module imports is true. A
well-designed module should have top level code that performs idempotent
actions.

Be thankful that you've discovered this, and apply it well :-)
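
To make "idempotent top-level actions" concrete, here is a small sketch. The names (handlers, registry, on_save and the two register functions) are invented for the example and do not come from any real library. The point is only that repeating the second kind of call leaves the state unchanged, while repeating the first does not.

```python
# Non-idempotent: every call adds another entry.
handlers = []

def register_naive(handler):
    handlers.append(handler)

# Idempotent: calling again with the same name overwrites the same slot.
registry = {}

def register_idempotent(name, handler):
    registry[name] = handler

def on_save():
    return "saved"

# Simulate the same top-level code being executed twice.
for _ in range(2):
    register_naive(on_save)
    register_idempotent("on_save", on_save)

print(len(handlers))   # 2: the duplicate run left a duplicate handler
print(len(registry))   # 1: the duplicate run changed nothing
```

One caveat, as far as I can tell: if the module really is loaded under two names, each copy gets its own globals, so the shared state would have to live somewhere that is only loaded once.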

--
\ “See, in my line of work you gotta keep repeating things over |
`\ and over and over again, for the truth to sink in; to kinda |
_o__) catapult the propaganda.” —George W. Bush, 2005-05 |
Ben Finney
  #4 (permalink)  
Old 08-07-2012, 02:10 PM
Mark Lawrence
Default Re: I thought I understood how import worked...

On 07/08/2012 14:28, Ramchandra Apte wrote:
> I don't think the modules are actually imported twice. The entry is just
> doubled;that's all
>


Please don't top post, this is the third time of asking.

--
Cheers.

Mark Lawrence.

  #5 (permalink)  
Old 08-07-2012, 02:14 PM
Laszlo Nagy
Default Re: I thought I understood how import worked...

On 2012-08-07 15:55, Ben Finney wrote:
> Roy Smith <roy@panix.com> writes:
>
>> So, it appears that you *can* import a module twice, if you refer to
>> it by different names! This is surprising.

> The tutorial is misleading on this. It says plainly:
>
> A module can contain executable statements as well as function
> definitions. […] They are executed only the *first* time the module
> is imported somewhere.
>
> <URL:http://docs.python.org/tutorial/modules.html>
>
> but it doesn't make clear that a module can exist in the ‘sys.modules’
> list multiple times under different names.

sys.modules is a dict. But yes, there can be multiple "instances" of the
same module loaded.

What I do with bigger projects is that I always use absolute module
names. For example, when I develop a project called "project1" that has
several sub packages, then I always do these kinds of imports:

from project1.package1.subpackage2.submodule3 import *
from project1.package1.subpackage2 import submodule3
from project1.package1.subpackage2.submodule3 import some_class

Even from a source file that is inside project1.package1.subpackage2, I
tend to import them the same way. This makes sure that every module is
imported under the same package path.

You just need to make sure that the main project has a unique name
(which is usually the case) and that it is on your sys path (which is
usually the case, especially when the script is started in the project's
directory).

The cost is that you have to type more. The benefit is that you can be
sure that you are importing the thing that you want to import, and there
will be no multiple imports for the same module.
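
Here is a minimal Python 3 sketch of that convention. The project1/package1 layout is built in a temporary directory, and the module names alpha and beta are placeholders for the example. Only the project root goes on sys.path, and the sibling module spells out the full package path.

```python
import importlib
import os
import sys
import tempfile

# Build <root>/project1/package1/{__init__,alpha,beta}.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, "project1", "package1")
os.makedirs(pkg)
for d in (os.path.join(root, "project1"), pkg):
    open(os.path.join(d, "__init__.py"), "w").close()
with open(os.path.join(pkg, "alpha.py"), "w") as f:
    f.write("print('executing', __name__)\n")
with open(os.path.join(pkg, "beta.py"), "w") as f:
    # Even a sibling module uses the absolute name.
    f.write("from project1.package1 import alpha\n")

sys.path.insert(0, root)   # only the project root is importable
importlib.invalidate_caches()

from project1.package1 import alpha
from project1.package1 import beta

# Every importer used the same name, so there is one module object.
print(beta.alpha is alpha)   # True
names = [n for n, m in list(sys.modules.items())
         if getattr(m, "__file__", None) == alpha.__file__]
print(names)
```

The file alpha.py is executed once, and it appears in sys.modules under a single name.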

Maybe somebody will suggest a method that works even better.

For small projects without sub-packages, it is not a problem.

Best,

Laszlo

  #6 (permalink)  
Old 08-07-2012, 03:25 PM
Roy Smith
Default Re: I thought I understood how import worked...

On Tuesday, August 7, 2012 9:52:59 AM UTC-4, Steven D'Aprano wrote:

> In general, you should avoid non-idempotent code. You should
> doubly avoid it during imports, and triply avoid it on days ending with Y.


I don't understand your aversion to non-idempotent code as a general rule. Most code is non-idempotent. Surely you're not saying we should never write:

>>> foo += 1


or

>>> my_list.pop()


???

Making top-level module code idempotent, I can understand (given this new-found revelation that modules aren't really singletons), but you seem to be arguing something stronger and more general.
  #7 (permalink)  
Old 08-07-2012, 03:32 PM
Roy Smith
Default Re: I thought I understood how import worked...

On Tuesday, August 7, 2012 9:55:16 AM UTC-4, Ben Finney wrote:

> The tutorial is misleading on this. It says plainly:
>
> A module can contain executable statements as well as function
> definitions. […] They are executed only the *first* time the module
> is imported somewhere.
>
> <URL:http://docs.python.org/tutorial/modules.html>


That's more than misleading. It's plain wrong. The example I gave demonstrates the "print __file__" statement getting executed twice.

The footnote to that is wrong too:

> [1] In fact function definitions are also statements that are executed; the execution of a
> module-level function enters the function name in the module's global symbol table.


I think what it's supposed to say is "... the execution of a module-level def statement ..."

> Care to file a documentation bug <URL:http://bugs.python.org/>
> describing this?


Sure, once I understand how it's really supposed to work :-)

  #8 (permalink)  
Old 08-07-2012, 03:53 PM
Paul Rubin
Default Re: I thought I understood how import worked...

Roy Smith <roy@panix.com> writes:
>> In general, you should avoid non-idempotent code.

> I don't understand your aversion to non-idempotent code as a general
> rule. Most code is non-idempotent. Surely you're not saying we
> should never write:
> >>> foo += 1

> or
> >>> my_list.pop()

> ???


I don't think "in general avoid" means the same thing as "never write".

One of the tenets of the functional-programming movement is that it is
in fact reasonable to write in a style that avoids "foo += 1" and
"my_list.pop()" most of the time, leading to cleaner, more reliable
code.

In Python it's not possible to get rid of ALL of the data mutation
without horrendous contortions, but it's pretty easy (and IMHO of
worthwhile benefit) to avoid quite a lot of it.
  #9 (permalink)  
Old 08-07-2012, 04:49 PM
Terry Reedy
Default Re: I thought I understood how import worked...

On 8/7/2012 9:28 AM, Ramchandra Apte wrote:
> I don't think the modules are actually imported twice.


This is incorrect as Roy's original unposted example showed.
Modify one of the two copies and it will be more obvious.

PS. I agree with Mark about top posting. I often just glance at such
postings rather than go look to find out the context. However, this one
is wrong on its own ;-).

--
Terry Jan Reedy

  #10 (permalink)  
Old 08-07-2012, 05:44 PM
Mark Lawrence
Default Re: I thought I understood how import worked...

On 07/08/2012 14:18, Roy Smith wrote:
> I've been tracking down some weird import problems we've been having with
> django. Our settings.py file is getting imported twice. It has some
> non-idempotent code in it, and we blow up on the second import.
>
> I thought modules could not get imported twice. The first time they get
> imported, they're cached, and the second import just gets you a reference to the
> original. Playing around, however, I see that it's possible to import a module
> twice if you refer to it by different names. Here's a small-ish test case which
> demonstrates what I'm talking about (python 2.6.5):
>
> In directory /home/roy/play/import/foo, I've got:
>
> __init__.py (empty file)
> try.py
> broken.py
>
>
> $ cat broken.py
> print __file__
>
>
> $ cat try.py
> import broken
> import foo.broken
>
> import sys
> for m in sys.modules.items():
>     if m[0].endswith('broken'):
>         print m
>
>
> And when I run try.py (with foo as the current directory):
>
> $ PYTHONPATH=/home/roy/play/import python try.py
> /home/roy/play/import/foo/broken.pyc
> /home/roy/play/import/foo/broken.pyc
> ('broken', <module 'broken' from '/home/roy/play/import/foo/broken.pyc'>)
> ('foo.broken', <module 'foo.broken' from '/home/roy/play/import/foo/broken.pyc'>)
>
>
> So, it appears that you *can* import a module twice, if you refer to it by
> different names! This is surprising. It means that having non-idempotent code
> which is executed at import time is a Bad Thing.
>
> It also means that you could have multiple copies of a module's global
> namespace, depending on how your users imported the module. Which is kind of
> mind-blowing.
>


Maybe not directly applicable to what you're saying, but Brett Cannon
ought to know something about the import mechanism. I believe he's been
working on it on and off for several years. See
http://docs.python.org/dev/whatsnew/3.3.html for a starter on the gory
details.

--
Cheers.

Mark Lawrence.

  #11 (permalink)  
Old 08-07-2012, 05:54 PM
Steven D'Aprano
Default Re: I thought I understood how import worked...

On Tue, 07 Aug 2012 08:25:43 -0700, Roy Smith wrote:

> On Tuesday, August 7, 2012 9:52:59 AM UTC-4, Steven D'Aprano wrote:
>
>> In general, you should avoid non-idempotent code. You should doubly
>> avoid it during imports, and triply avoid it on days ending with Y.


You seem to have accidentally deleted my smiley.

> I don't understand your aversion to non-idempotent code as a general
> rule. Most code is non-idempotent.


That doesn't necessarily make it a good thing. Most code is also buggy.


> Surely you're not saying we should never write:
>
> >>> foo += 1
>
> or
>
> >>> my_list.pop()

>
> ???


Of course not. I'm not going so far as to say that we should always,
without exception, write purely functional code. I like my list.append as
much as anyone.

But at the level of larger code units, functions and modules, it is a
useful property to have where possible. A function is free to increment
an integer, or pop items from a list, as much as it likes -- so long as
they are *local* to the function, and get reset to their initial state
each time the function is called with the same arguments.

I realise that many problems are most easily satisfied by non-idempotent
tactics. "Customer orders widget" is not naturally idempotent, since if
the customer does it twice, they get two widgets, not one. But such
behaviour should be limited to the parts of your code which must be
non-idempotent.

In short, non-idempotent code is hard to get right, hard to test, and
hard to debug, so we should use as little of it as possible.



--
Steven
  #12 (permalink)  
Old 08-07-2012, 10:47 PM
Cameron Simpson
Default Re: I thought I understood how import worked...

On 07Aug2012 13:52, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
| On Tue, 07 Aug 2012 09:18:26 -0400, Roy Smith wrote:
| > I thought modules could not get imported twice. The first time they get
| > imported, they're cached, and the second import just gets you a
| > reference to the original. Playing around, however, I see that it's
| > possible to import a module twice if you refer to it by different names.
|
| Yes. You've found a Python gotcha.
[...]
| > $ cat try.py
| > import broken
| > import foo.broken
|
| Which are two names for the same module.
[...]

This, I think, is a core issue in this misunderstanding. (I got bitten
by this too, maybe a year ago. My error, and I'm glad to have improved
my understanding.)

All of you are saying "two names for the same module", and variations
thereof. And that is why the doco confuses.

I would expect less confusion if the above example were described as
_two_ modules, with the same source code.

Make it clear that these are _two_ modules (because they have two
names), which merely happen to have been obtained from the same "physical"
filesystem object due to path search effects. That is, change the doco
wording to describe a module as the in-memory result of reading a "file"
found from an import name.

So I think I'm arguing for a small change in terminology in the doco
with no change in Python semantics. Is a module a set of files on the
disc, or an in-memory Python notion with a name? I would argue for the
latter.

With such a change, the "a module can't be imported twice" would then be
true (barring hacking around in sys.modules between imports).
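
The "in-memory notion with a name" view can be illustrated without any file at all. This is a sketch for Python 3; demo_module and demo_alias are invented names, and the second half is exactly the kind of sys.modules hacking mentioned above.

```python
import sys
import types

# A module object with no file behind it, registered under a name.
mod = types.ModuleType("demo_module")
mod.answer = 42
sys.modules["demo_module"] = mod

import demo_module            # satisfied straight from the sys.modules cache
print(demo_module is mod)     # True

# Hacking the cache: the same object registered under a second name.
sys.modules["demo_alias"] = mod
import demo_alias
print(demo_alias is demo_module)   # True
```

So the import statement looks a name up in the cache first. One name maps to one module object, and it takes deliberate tinkering to make two names share an object.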

Cheers,
--
Cameron Simpson <cs@zip.com.au>

As you can see, unraveling even a small part of 'sendmail' can introduce more
complexity than answers. - Brian Costales, _sendmail_
  #13 (permalink)  
Old 08-07-2012, 11:05 PM
Roy Smith
Default Re: I thought I understood how import worked...

In article <mailman.3071.1344380066.4697.python-list@python.org>,
Cameron Simpson <cs@zip.com.au> wrote:

> This, I think, is a core issue in this misunderstanding. (I got bitten
> by this too, maybe a year ago. My error, and I'm glad to have improved
> my understanding.)
>
> All of you are saying "two names for the same module", and variations
> thereof. And that is why the doco confuses.
>
> I would expect less confusion if the above example were described as
> _two_ modules, with the same source code.


+1
  #14 (permalink)  
Old 08-08-2012, 04:14 AM
Ben Finney
Default Re: I thought I understood how import worked...

Cameron Simpson <cs@zip.com.au> writes:

> All of you are saying "two names for the same module", and variations
> thereof. And that is why the doco confuses.
>
> I would expect less confusion if the above example were described as
> _two_ modules, with the same source code.


That's not true though, is it? It's the same module object with two
different references, I thought.

Also, even if what you say were true, “source code” implies the module
was loaded from source code, when Python allows loading modules with no
source code available. So that implication just seems to be inviting
different confusion.

--
\ “I'm not a bad guy! I work hard, and I love my kids. So why |
`\ should I spend half my Sunday hearing about how I'm going to |
_o__) Hell?” —Homer Simpson |
Ben Finney
  #15 (permalink)  
Old 08-08-2012, 06:40 AM
Laszlo Nagy
Default Re: I thought I understood how import worked...

On 2012-08-08 06:14, Ben Finney wrote:
> Cameron Simpson <cs@zip.com.au> writes:
>
>> All of you are saying "two names for the same module", and variations
>> thereof. And that is why the doco confuses.
>>
>> I would expect less confusion if the above example were described as
>> _two_ modules, with the same source code.

> That's not true though, is it? It's the same module object with two
> different references, I thought.

They are not the same. Proof:

$ mkdir test
$ cd test
$ touch __init__.py
$ touch m.py
$ cd ..
$ python
Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path.append('test')
>>> import m
>>> from test import m
>>> import m
>>> from test import m as m2
>>> m is m2
False
>>> m.a = 3
>>> m2.a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'a'

So it is still true that top level code gets executed only once, when
the module is first imported. The trick is that a module is not a file.
It is a module object that is created from a file, with a name. If you
change the name, then you create ("import") a new module.

You can also use the reload() function to execute module-level code
again, but it won't create a new module object. It just updates the
contents of the very same module object, as this session shows:

Python 2.7.3 (default, Apr 20 2012, 22:39:59)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import test.m
>>> a = test.m
>>> import os
>>> test.m is a
True
>>> os.system("echo \"import sys\" >> test/m.py")
0
>>> reload(test.m) # Updates the module object
<module 'test.m' from 'test/m.py'>
>>> test.m is a # They are still the same
True
>>> a.sys # So a.sys exists
<module 'sys' (built-in)>
>>>
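
On Python 3 the builtin reload() is gone and the equivalent lives in importlib. A minimal sketch of the same experiment follows; the module name reload_demo and the temporary directory are assumptions made so the script is self-contained.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep cached bytecode out of this demo

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "reload_demo.py")
with open(path, "w") as f:
    f.write("value = 1\n")

sys.path.insert(0, workdir)
importlib.invalidate_caches()
import reload_demo
alias = reload_demo              # a second reference to the module object

# Change the source on disk, then re-execute it in the existing module.
with open(path, "a") as f:
    f.write("extra = 'added later'\n")
reloaded = importlib.reload(reload_demo)

print(reloaded is alias)         # True: same object, updated in place
print(alias.extra)               # the new name is visible via the old reference
```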


