Issue433625
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2001-06-16 02:17 by shihao, last changed 2022-04-10 16:04 by admin. This issue is now closed.
| Messages (13) | |||
|---|---|---|---|
| msg5065 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-16 02:17 | |
Mutex should be hold when calling
pthread_cond_signal(). This function should look like:
PyThread_release_lock(PyThread_type_lock lock)
{
pthread_lock *thelock = (pthread_lock *)lock;
int status, error = 0;
dprintf(("PyThread_release_lock(%p) called\n",
lock));
status = pthread_mutex_lock( &thelock->mut );
CHECK_STATUS("pthread_mutex_lock[3]");
thelock->locked = 0;
/* ***** call pthread_cond_signal before unlock
mutex */
status = pthread_cond_signal(
&thelock->lock_released );
CHECK_STATUS("pthread_cond_signal");
status = pthread_mutex_unlock( &thelock->mut );
CHECK_STATUS("pthread_mutex_unlock[3]");
/* wake up someone (anyone, if any) waiting on
the lock */
}
|
|||
| msg5066 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-06-17 21:16 | |
Logged In: YES user_id=31435 Why? It's allowed to signal the condition whether or not the mutex is held. Since changing this can have visible effects on thread scheduling, I'm reluctant to change it without a good reason. |
|||
| msg5067 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2001-06-17 21:51 | |
Logged In: YES user_id=6380 Set status to closed -- no need to delete it. |
|||
| msg5068 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-06-17 22:01 | |
Logged In: YES user_id=31435 Ack, did I delete this?! I sure didn't intend to -- didn't even intend to close it. Reopened pending more info. |
|||
| msg5069 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-06-17 23:02 | |
Logged In: YES user_id=31435 Closing this again, as it appears the original submitter deleted it. shihao, if you want to pursure this, open it again. |
|||
| msg5070 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-18 06:01 | |
Logged In: YES
user_id=246388
I closed it because I thought the thelock->locked variable
will ensure that the PyThread_release_lock will help to
protect the condition variable and I was wrong. The
linuxthread man page on pthread_cond_signal:
A condition variable must always be associated with a
mutex, to avoid the race condition where a thread prepares
to wait on a condition variable and another thread signals
the condition just before the first thread actually waits
on it.
which means you can't call pthread_cond_signal &
pthread_cond_wait on the same condition variable at the
same time. And using a mutex to protect them is a good
idea. Here is how thing might go wrong with current
implementation:
thread 1 thread 2
|int PyThread_acquire_lock _
|/** assume lock was acquired
| by thread 1, hence locked=0
| & success would be 0 **/
|{
| ...
| status = pthread_mutex_lo
| CHECK_STATUS("pthread_mut
| success = thelock->locked
| if (success) thelock->loc
| status = pthread_mutex_un
| /** thread 2 suspended **/
void PyThread_release_lock _|
{ |
... |
status = pthread_mutex_loc|
CHECK_STATUS("pthread_mute|
|
thelock->locked = 0; |
|
status = pthread_mutex_unl|
/** thread 1 suspend **/ |
| CHECK_STATUS("pthread_mut
|
| if ( !success && waitflag
| /* continue trying unti
|
| /* mut must be locked b
| * protocol */
| status = pthread_mutex_
| CHECK_STATUS("pthread_m
| while ( thelock->locked
| status = pthread_cond
|/** thread 2 suspended while
| updating shared data **
CHECK_STATUS("pthread_mute|
|
/* wake up someone (anyone|
status = pthread_cond_sign|
/** thread 1 update shared |
data and corrupt it. **/ |
Not sure what the effect would be. It's wouldn't be nice
anyway.
|
|||
| msg5071 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-06-19 20:52 | |
Logged In: YES user_id=31435 It appears you're concerned that the signal will "get lost" in this scenario. I agree that it may, but it doesn't matter: thread 2's "while (thelock->locked)" test fails because thread 1 already set thelock->locked to 0, so thread 2 doesn't execute its pthread_cond_wait, so it doesn't matter that nobody is *waiting* to see thread 1's pthread_cond_signal. IOW, the condition thread 1 will try to signal has *already* been detected by thread 2, and the signal is useless (because redundant) information. Indeed, it's a beauty of the condition protocol that signals can be sloppy. The linuxthread man page is worded strangely here, but note that it does not say the mutex *must* be locked. They can't, either, because POSIX doesn't require it; while POSIX stds aren't available freely online, some derived specs are, and are much clearer about this; e.g., see the Single Unix Specification, here: <http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_con d_signal.html> I'll tell you why I don't *want* to change this: in Python's use of the global interpreter lock, it's almost always the case that someone is waiting on the lock. By releasing the mutex before signaling, this gives a waiter a chance to run immediately upon calling pthread_cond_signal. Else, because pthread_cond_wait (which the waiters are executing) has to lock the mutex, if the signaler is holding the mutex during the signal, pthread_cond_signal can't finish the job -- it was to go back to the signaler and let it unlock the mutex first before a waiter can proceed. This makes the region of exclusion longer than it needs to be. So if there's not an actual race problem here (& I still don't see one), I don't want to change this. |
|||
| msg5072 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-19 21:41 | |
Logged In: YES user_id=246388 Oops. The problem will be arised if there is a thread 3 called PyThread_acquire_lock after thread 1 set thelock->locked to 0 and before thread 2 calling pthread_mutex_lock. "while (thelock->locked)" will success for thread 2 and it will call pthread_cond_wait and might collide with thread 1's pthread_cond_signal. |
|||
| msg5073 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-06-19 22:06 | |
Logged In: YES user_id=31435 > The problem will be arised if there is a thread 3 > called PyThread_acquire_lock You never mentioned thread 3 again, so I don't know what it has to do with this. > after thread 1 set thelock->locked to 0 and before > thread 2 calling pthread_mutex_lock. I understand both of those. Are you assuming that, e.g., thread 3's PyThread_acquire_lock completes in whole during this gap? I don't know what else you could mean, so let's assume that. > "while (thelock->locked)" will success for thread 2 Sure. > and it will call pthread_cond_wait Yup. > and might collide with thread 1's pthread_cond_signal. What does "collide" mean to you? All the pthread_cond_xxx functions must be implemented as if atomic, so there's no meaningful sense (to me) in which they can collide -- unless they're implemented incorrectly. Assuming they are implemented correctly, it again doesn't matter that thread 2 misses thread 1's signal, because thread *3* exploited the information thread 1 was going to signal, by acquiring the lock. It's actually good that thread 2 isn't bothered with it: there's no real info in the signal anymore (at best, if thread 2 got it, it would wake up and go "oops! it's still locked; I'll wait again"). All that matters now is whether thread 2 gets a chance to see thread *3*'s signal, at the time thread 3 releases the lock. And it will, because thread 3 can't release the lock without acquiring the mutex first, and thread 2 holds the mutex at all times except when in its pthread_cond_wait call (so thread 3 can't release the lock except when thread 2 is in pthread_cond_wait). Note that I'm not at all concerned about "fairness" here, only about races. |
|||
| msg5074 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-20 05:14 | |
Logged In: YES
user_id=246388
> I understand both of those. Are you assuming that, e.g.,
> thread 3's PyThread_acquire_lock completes in whole
during
> this gap? I don't know what else you could mean, so
let's
> assume that.
Yes, I assume thread 3 completed in this gap and it is
possible.
"Collide" means while thread 2's pthread_cond_wait was
modifiying the internal data structure, thread 1 calls
pthread_cond_signal.
The point here is not missing signals, the problem is the
possiblity that pthread_cond_signal preempt the execution
of pthread_cond_wait. I can't find any document says
pthread_cond_xxx functions must be automic operations.
In pthread_cond_xxx man page, it only mentioned:
pthread_cond_wait atomically unlocks the mutex (as per
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pthread_unlock_mutex) and waits for the condition variable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cond to be signaled. The thread execution is suspended and
^^^^^^^^^^^^^^^^^^^
does not consume any CPU time until the condition variable
is signaled. The mutex must be locked by the calling
thread on entrance to pthread_cond_wait. Before returning
to the calling thread, pthread_cond_wait re-acquires mutex
(as per pthread_lock_mutex).
Unlocking the mutex and suspending on the condition vari
able is done atomically. Thus, if all threads always
acquire the mutex before signaling the condition, this
guarantees that the condition cannot be signaled (and thus
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ignored) between the time a thread locks the mutex and the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
time it waits on the condition variable.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I take it that between the time the mutex is locked by
pthread_mutex_lock and unlocked by pthread_cond_wait, you
can't call pthread_cond_signal. I also browse through
LinuxThread implementation and can't find pthread_cond_xxxx
is implemented automically.
I found there is another way to fix this without having to
call the pthread_cond_signal while holding the mutex. If
we do:
if (!thelock->locked)
status = pthread_cond_signal( &thelock->lock_released );
we guarantee that when pthread_cond_signal is called, the
acquire_lock code will not be in the middle of
pthread_cond_signal.
|
|||
| msg5075 - (view) | Author: Shih-Hao Liu (shihao) | Date: 2001-06-20 07:17 | |
Logged In: YES user_id=246388 > I also browse through > LinuxThread implementation and can't find > pthread_cond_xxxx is implemented automically. I found they do call __pthread_lock when updating the priority queue after take a closer look. I guess I have to look at elsewhere to figure out why my Python script is spinning. |
|||
| msg5076 - (view) | Author: Tim Peters (tim.peters) * ![]() |
Date: 2001-10-18 16:18 | |
Logged In: YES user_id=31435 Deleted this bug report, at Shih-Hao Liu's request. |
|||
| msg351829 - (view) | Author: Kirill Smelkov (navytux) * | Date: 2019-09-11 11:34 | |
I still believe there is a race here and it can lead to MEMORY CORRUPTION and DEADLOCK: https://bugs.python.org/issue38106. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-10 16:04:08 | admin | set | github: 34637 |
| 2019-09-11 11:34:32 | navytux | set | nosy:
+ navytux messages: + msg351829 |
| 2001-06-16 02:17:17 | shihao | create | |
