Why does Python copy numpy arrays when the lengths of the dimensions are the same?
I have a problem with referencing a numpy array.
I have an array of the form

import numpy as np
a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8])]

and if I now create a new variable

b = np.array(a)

and do

b[0] += 1
print(a)

then a does not change:

a = [array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8])]
But if I do the same thing with

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6])]

so I removed one number from the end of the last array, and then do this again:

b = np.array(a)
b[0] += 1
print(a)

then a does change, which is what I thought was the normal behavior in Python:

a = [array([1. , 1.2, 1.4, 1.6, 1.8]),
     array([0. , 0.2, 0.4, 0.6, 0.8]),
     array([0. , 0.2, 0.4, 0.6])]

Can anybody explain this to me?
python python-3.x numpy
This is one of the reasons trying to make jagged arrays or arrays of arrays in NumPy is a really bad idea. – user2357112
5 Answers
In the first case, NumPy sees that the input to numpy.array can be interpreted as a 3x5, 2-dimensional array-like, so it does that. The result is a new array of float64 dtype, with the input data copied into it, independent of the input object. b[0] is a view of the new array's first row, completely independent of a[0], and modifying b[0] does not affect a[0].

In the second case, since the lengths of the subarrays are unequal, the input cannot be interpreted as a 2-dimensional array-like. However, considering the subarrays as opaque objects, the list can be interpreted as a 1-dimensional array-like of objects, which is the interpretation NumPy falls back on. The result of the numpy.array call is a 1-dimensional array of object dtype, containing references to the array objects that were elements of the input list. b[0] is the same array object that a[0] is, and b[0] += 1 mutates that object.

This length dependence is one of the many reasons that trying to make jagged arrays or arrays of arrays is a really, really bad idea in NumPy. Seriously, don't do it.

– user2357112
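
As a quick way to see this distinction directly (a minimal sketch, not part of the answer above; the variable names are illustrative), compare memory sharing and object identity in the two cases:

import numpy as np

equal = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
ragged = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
          np.array([0.0, 0.2, 0.4, 0.6])]

b_equal = np.array(equal)                  # 2-D float64 array: the numbers are copied
b_ragged = np.array(ragged, dtype=object)  # 1-D object array: only references are stored;
                                           # dtype=object is spelled out because recent NumPy
                                           # versions reject ragged input without it

# Row 0 of the 2-D array does not share its buffer with the original first array ...
print(np.shares_memory(equal[0], b_equal[0]))  # False
# ... whereas element 0 of the object array is literally the same ndarray object.
print(b_ragged[0] is ragged[0])                # True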
In a nutshell, this is a consequence of your data: the behavior changes because, in the second case, your arrays are not equally sized.

With equally sized sub-arrays, the elements can be loaded compactly into a memory-efficient scheme in which any N-D array is backed by a contiguous 1-D array in memory. NumPy then handles the translation of multi-dimensional indices to 1-D indices internally. For example, index [i, j] of a 2-D array with N columns maps to i*N + j (when stored in row-major format). The data from the original list of arrays is copied into this compact 1-D array, so any modifications made to the new array do not affect the original.

With ragged lists/arrays, this cannot be done. The resulting array is effectively a Python list, where each element is a Python object. For efficiency, only the object references are copied, not the data. This is why you can mutate the original list elements in the second case but not the first.

– coldspeed
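
To make the i*N + j mapping concrete, here is a small sketch (my own illustration, not part of the answer) comparing the row-major flat index with NumPy's 1-D view of the same buffer:

import numpy as np

b = np.array([[0.0, 0.2, 0.4, 0.6, 0.8],
              [0.0, 0.2, 0.4, 0.6, 0.8],
              [0.0, 0.2, 0.4, 0.6, 0.8]])

rows, cols = b.shape            # (3, 5)
i, j = 1, 3

# Row-major (C-order) mapping from the 2-D index to a position in the flat buffer.
flat_index = i * cols + j

print(b[i, j])                  # 0.6
print(b.ravel()[flat_index])    # 0.6 -- the same element, addressed through the 1-D view
print(b.strides)                # (40, 8): 5 * 8 bytes to step one row, 8 bytes per column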
When you make a np.array from lists of consistent lengths, a new np.ndarray of floats is created. Thus, your a[0] and b[0] do not share the same reference.

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
     np.array([0.0, 0.2, 0.4, 0.6, 0.8])]
b = np.array(a)

id(a[0])
# 139663994327728
id(b[0])
# 139663994324672

However, with varying lengths of lists, np.array creates an np.ndarray with object elements.

a2 = [np.array([0. , 0.2, 0.4, 0.6, 0.8]),
      np.array([0. , 0.2, 0.4, 0.6, 0.8]),
      np.array([0. , 0.2, 0.4, 0.6])]
b2 = np.array(a2)

b2
array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
       array([0. , 0.2, 0.4, 0.6])], dtype=object)

Here b2 still keeps the same references as a2:

for s in a2:
    print(id(s))
# 139663994330128
# 139663994328448
# 139663994329488

for s in b2:
    print(id(s))
# 139663994330128
# 139663994328448
# 139663994329488

This is why an addition to b2[0] results in an addition to a2[0].

– Chris
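
A related point worth spelling out (my addition, not from the answer): with an object array it is in-place mutation that propagates back, not re-assignment of a slot. A quick sketch:

import numpy as np

a2 = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
      np.array([0.0, 0.2, 0.4, 0.6])]
# dtype=object is given explicitly; recent NumPy versions reject ragged input without it.
b2 = np.array(a2, dtype=object)

b2[0] += 1                 # in-place: mutates the shared ndarray object
print(a2[0])               # [1.  1.2 1.4 1.6 1.8]

b2[1] = np.zeros(5)        # re-assignment: only rebinds b2's slot 1, a2[1] is untouched
print(a2[1])               # [0.  0.2 0.4 0.6 0.8]
print(b2[1] is a2[1])      # False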
In [1]: a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
   ...:      np.array([0.0, 0.2, 0.4, 0.6, 0.8]),
   ...:      np.array([0.0, 0.2, 0.4, 0.6, 0.8])]

In [2]: a
Out[2]:
[array([0. , 0.2, 0.4, 0.6, 0.8]),
 array([0. , 0.2, 0.4, 0.6, 0.8]),
 array([0. , 0.2, 0.4, 0.6, 0.8])]

a is a list of arrays. b is a 2d array.

In [3]: b = np.array(a)

In [4]: b
Out[4]:
array([[0. , 0.2, 0.4, 0.6, 0.8],
       [0. , 0.2, 0.4, 0.6, 0.8],
       [0. , 0.2, 0.4, 0.6, 0.8]])

In [5]: b[0] += 1

In [6]: b
Out[6]:
array([[1. , 1.2, 1.4, 1.6, 1.8],
       [0. , 0.2, 0.4, 0.6, 0.8],
       [0. , 0.2, 0.4, 0.6, 0.8]])

b gets values from a but does not contain any of the a objects. The underlying data structure of this b is very different from a, the list. If that isn't clear, you may want to review the numpy basics (which talk about shape, strides, and data buffers).

In the second case, b is an object array, containing the same objects as a:

In [8]: b = np.array(a)

In [9]: b
Out[9]:
array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
       array([0. , 0.2, 0.4, 0.6])], dtype=object)

This b behaves a lot like the a list - both contain arrays.

The construction of this object array is quite different from the 2d numeric array. I think of the numeric array as the default, or normal, numpy behavior, while the object array is a 'concession', giving us a useful tool, but one which does not have the calculation power of the multidimensional array.

It is easy to make an object array by mistake - some say too easy. It can be harder to make one reliably by design. For example, with the original a, we have to do:

In [17]: b = np.empty(3, object)

In [18]: b[:] = a[:]

In [19]: b
Out[19]:
array([array([0. , 0.2, 0.4, 0.6, 0.8]), array([0. , 0.2, 0.4, 0.6, 0.8]),
       array([0. , 0.2, 0.4, 0.6, 0.8])], dtype=object)

or even: for i in range(3): b[i] = a[i]

– hpaulj
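
Why go through np.empty at all? A short sketch (my addition, not from the answer) showing that, when the sub-arrays all have the same length, passing dtype=object straight to np.array still produces a 2-D array rather than a 1-D array of array objects, which is exactly what the np.empty route avoids:

import numpy as np

a = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]

direct = np.array(a, dtype=object)
print(direct.shape, direct.dtype)   # (3, 5) object -- still 2-D, each element is a boxed scalar

b = np.empty(3, dtype=object)       # pre-allocate the 1-D object container ...
b[:] = a                            # ... then fill it, so each slot holds one ndarray
print(b.shape, b.dtype)             # (3,) object
print(b[0] is a[0])                 # True -- slot 0 is the original first array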
The primary use case for which numpy.array() has been designed is to create an n-dimensional array of numbers, in its own efficiently designed internal structure. Whenever it is possible to do this, numpy.array() will indeed do it.

When your a is a list of 3 ndarrays, each of size 5, it is clearly possible for numpy.array() to create an n-dimensional ndarray of numbers (specifically a 2-dimensional one, with shape (3,5)). So, any change to b[0] is actually a change to this internal data structure of numbers, which were all copied over from a.

When your a is a list of unequally sized ndarrays, it is no longer possible for numpy.array() to convert this into an n-dimensional array of shape (3,5). So, the function does the next best thing it can do, which is to treat each of the 3 ndarrays as an object and return a 1-dimensional ndarray of those objects. The length of this returned ndarray is 3 (the number of objects). You can see this by printing b.shape (which will print (3,)) and b.dtype (which will print object).

In this case, numpy.array() does not dive deeper into each of your 3 ndarrays to make copies of them, since it is not going to create its own efficiently designed n-dimensional array of numbers -- it is only going to return a 1-dimensional array of objects.

– fountainhead
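
For completeness, a tiny check of the shape and dtype claims above (my own sketch; the two inputs mirror the question's lists):

import numpy as np

equal = [np.array([0.0, 0.2, 0.4, 0.6, 0.8]) for _ in range(3)]
ragged = equal[:2] + [np.array([0.0, 0.2, 0.4, 0.6])]

b_eq = np.array(equal)
print(b_eq.shape, b_eq.dtype)            # (3, 5) float64 -- numbers copied into NumPy's own buffer

b_rag = np.array(ragged, dtype=object)   # dtype=object needed for ragged input on recent NumPy
print(b_rag.shape, b_rag.dtype)          # (3,) object -- just three references, nothing copied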